Abstract

We produce a series of results extending information-theoretical inequalities (discussed 
by Dembo-Cover-Thomas in 1989-1991) to a weighted version of entropy. The resulting 
inequalities involve the Gaussian weighted entropy; they imply a number of new relations 
for determinants of positive-definite matrices. 


1 Introduction 

The aim of this paper is to give a number of new bounds involving determinants of
positive-definite matrices. These bounds can be considered as generalizations of inequalities discussed
in [2, 5]. A common feature of determinant inequalities (DIs) from [2, 5] is that most of them
have been previously known but often proven by individual arguments (see the bibliography in
[2, 5]). The unifying approach adopted in [2, 5] emphasized their common nature connected
with/through information-theoretical entropies.

The bounds presented in the current paper are also obtained by a unified method which is
based on weighted entropies (WEs), more precisely, on Gaussian WEs. Hence, we speak here
of weighted determinant bounds/inequalities. The weighted determinant inequalities (WDIs)
offered in the present paper are novel, at least to the best of our knowledge. Moreover, when
we choose the weight function to be a (positive) constant, a WDI becomes a 'standard' DI. In
fact, the essence of this work is that we systematically examined DIs from [2, 5] for a possibility
of a (direct) extension to non-constant weight functions; successful attempts formed the present
paper. This reflects a particular feature of the present paper: a host of new inequalities are
obtained by an old method while [2, 5] re-establish old inequalities by using a new method.

As a primary example, consider the so-called Ky Fan inequality. (We follow the terminology
used in [2, 5, 3].) This inequality asserts that $\delta(C) := \log\det C$ is a concave function of a
positive-definite $d\times d$ matrix $C$. In other words, for all strictly positive-definite $d\times d$ matrices
$C_1$, $C_2$ and $\lambda_1,\lambda_2\geq 0$ with $\lambda_1+\lambda_2=1$,

$\delta(\lambda_1C_1+\lambda_2C_2) - \lambda_1\delta(C_1) - \lambda_2\delta(C_2)\geq 0$; equality iff $\lambda_1\lambda_2=0$. (1.1)

For original 'geometric' proofs of (1.1) and other related inequalities, see Ref [5] and the
bibliography therein. In [2], [5], [3] the derivation of (1.1) occupies a few lines and is based on the fact
that under a variance constraint, the differential entropy is maximized at a Gaussian density.
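
As a quick numerical illustration of (1.1), the sketch below evaluates the gap $\delta(\lambda_1C_1+\lambda_2C_2)-\lambda_1\delta(C_1)-\lambda_2\delta(C_2)$ on a grid of $\lambda_1$ for two $2\times 2$ matrices. The matrices `C1`, `C2` and all helper names are illustrative assumptions, not taken from the paper; the code is a sanity check, not a proof.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def mix(l1, c1, l2, c2):
    """Entrywise combination l1*c1 + l2*c2 of two 2x2 matrices."""
    return tuple(
        tuple(l1 * c1[i][j] + l2 * c2[i][j] for j in range(2))
        for i in range(2)
    )

def ky_fan_gap(c1, c2, l1):
    """delta(l1*C1 + l2*C2) - l1*delta(C1) - l2*delta(C2), delta = log det."""
    l2 = 1.0 - l1
    delta = lambda m: math.log(det2(m))
    return delta(mix(l1, c1, l2, c2)) - l1 * delta(c1) - l2 * delta(c2)

# Hypothetical positive-definite matrices chosen for illustration only.
C1 = ((2.0, 0.5), (0.5, 1.0))
C2 = ((1.0, -0.3), (-0.3, 3.0))

# Gap on the grid lambda_1 = 0, 0.1, ..., 1; it vanishes at the endpoints.
gaps = [ky_fan_gap(C1, C2, k / 10) for k in range(11)]
```

For this pair the gap is zero at $\lambda_1\in\{0,1\}$ and strictly positive in between, in agreement with (1.1).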

A weighted Ky Fan inequality (1.2) has been proposed in [9], Theorem 3.2; the derivation
is also short and based on a maximization property of the weighted entropy (cf. Theorem 3.1
below). Namely, given $C_1$, $C_2$ and $\lambda_1$, $\lambda_2$ as above and a nonnegative function
$x\in\mathbb{R}^d\mapsto\phi(x)$ positive on an open domain in $\mathbb{R}^d$, assume condition (1.6). Then

$\sigma(\lambda_1C_1+\lambda_2C_2) - \lambda_1\sigma(C_1) - \lambda_2\sigma(C_2)\geq 0$; equality again iff $\lambda_1\lambda_2=0$. (1.2)

Here, for a strictly positive-definite $C$, the value $\sigma(C)=\sigma_\phi(C)$ is as follows:

$\sigma(C) = \frac{\alpha_\phi(C)}{2}\log\left[(2\pi)^d(\det C)\right] + \frac{\log e}{2}\,{\rm tr}\,C^{-1}\Phi_{C,\phi} := h^{\rm w}_\phi(f^{\rm No}_C).$ (1.3)

Next, $\alpha_\phi(C)>0$ and positive-definite matrix $\Phi_{C,\phi}$ are given by

$\alpha_\phi(C) = \int_{\mathbb{R}^d}\phi(x_1^d)f^{\rm No}_C(x_1^d)\,{\rm d}x_1^d$, $\Phi_{C,\phi} = \int_{\mathbb{R}^d}x_1^d\,(x_1^d)^{\rm T}\phi(x_1^d)f^{\rm No}_C(x_1^d)\,{\rm d}x_1^d,$ (1.4)

and $f^{\rm No}_C$ stands for a normal probability density function (PDF) with mean 0 and covariance
matrix $C$:

$f^{\rm No}_C(x) = \frac{1}{(2\pi)^{d/2}(\det C)^{1/2}}\exp\left(-\frac{1}{2}x^{\rm T}C^{-1}x\right)$, $x = (x_1,\ldots,x_d)^{\rm T}\in\mathbb{R}^d.$ (1.5)

In terms of a multivariate normal random vector $X_1^d\sim f^{\rm No}_C$, $\alpha_\phi(C) = E\phi(X_1^d)$ and
$\Phi_{C,\phi} = E\left[\phi(X_1^d)X_1^d(X_1^d)^{\rm T}\right]$. In (1.5) and below we routinely omit the indices in the notation like $x_1^d$
and $X_1^d$. The quantity $h^{\rm w}_\phi(f^{\rm No}_C) = -\int\phi(x)f^{\rm No}_C(x)\log f^{\rm No}_C(x)\,{\rm d}x$ is the weighted entropy of $f^{\rm No}_C$
with weight function $\phi$, a concept analyzed in detail below. For $\phi(x) = 1$, $h^{\rm w}_\phi(f^{\rm No}_C)$ coincides
with a 'standard' (differential) entropy of a normal PDF.
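
The closed form of the Gaussian weighted entropy in (1.3) can be sanity-checked in dimension $d=1$. The sketch below assumes the exponential weight $\phi(x)=e^{tx}$ (the special case mentioned later in the introduction) and natural logarithms, so that $\log e = 1$; in that case $\alpha_\phi = e^{t^2\sigma^2/2}$ and $\Phi = (\sigma^2+t^2\sigma^4)e^{t^2\sigma^2/2}$ follow from the Gaussian moment generating function. Function names, the integration window and the step count are illustrative choices.

```python
import math

def gaussian_pdf(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def weighted_entropy_numeric(t, var, half_width=12.0, steps=20000):
    """Trapezoid-rule value of -int phi(x) f(x) ln f(x) dx with phi(x) = exp(t*x)."""
    sd = math.sqrt(var)
    a, b = -half_width * sd, half_width * sd
    h = (b - a) / steps
    total = 0.0
    for k in range(steps + 1):
        x = a + k * h
        f = gaussian_pdf(x, var)
        term = -math.exp(t * x) * f * math.log(f)
        total += term * (0.5 if k in (0, steps) else 1.0)
    return total * h

def weighted_entropy_closed_form(t, var):
    """(alpha/2) ln(2 pi var) + (1/2) Phi / var, i.e. formula (1.3) with d = 1."""
    alpha = math.exp(0.5 * t * t * var)            # E exp(tX)
    big_phi = (var + t * t * var * var) * alpha    # E X^2 exp(tX)
    return 0.5 * alpha * math.log(2.0 * math.pi * var) + 0.5 * big_phi / var

numeric = weighted_entropy_numeric(0.5, 1.69)
closed = weighted_entropy_closed_form(0.5, 1.69)
```

With $t=0$ the closed form reduces to $\frac{1}{2}\ln(2\pi e\sigma^2)$, the standard differential entropy of a normal PDF.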

The assumption upon $C_1$, $C_2$ and $\lambda_1$, $\lambda_2$ consists of two bounds and reads

$\lambda_1\alpha(C_1) + \lambda_2\alpha(C_2) - \alpha(\lambda_1C_1+\lambda_2C_2)\geq 0,$
$\left[\lambda_1\alpha(C_1) + \lambda_2\alpha(C_2) - \alpha(\lambda_1C_1+\lambda_2C_2)\right]\times\log\left\{(2\pi)^d\left[\det(\lambda_1C_1+\lambda_2C_2)\right]\right\} + {\rm tr}\left[(\lambda_1C_1+\lambda_2C_2)^{-1}\Lambda\right]\leq 0,$ (1.6)

where matrix $\Lambda = \lambda_1\Phi_{C_1} + \lambda_2\Phi_{C_2} - \Phi_{\lambda_1C_1+\lambda_2C_2}$. Bounds (1.6) have opposite directions and
stem from the weighted Gibbs inequality. Cf. Eqns (1.3), (3.3) from Ref [9] and (3.1), (3.5)
from Section 3 below.

When $\phi(x) = 1$, Eqn (1.6) is satisfied: we have equalities. In this case the weighted Ky Fan
inequality (1.2) transforms into (1.1). In general, condition (1.6) is not trivial: in a simplified
case of an exponential weight function $\phi(x) = \exp(t^{\rm T}x)$, $t\in\mathbb{R}^d$, it has been analyzed, both
analytically and numerically, in [10]. (Here, $\phi(x) = 1$ means $t = 0$.) As was shown in [10],
for given $C_1$, $C_2$, $\lambda_1$, $\lambda_2$ and $\phi$ (that is, for a given $t$), Eqn (1.6) may or may not be fulfilled.
(And when (1.6) fails, (1.2) may still hold true.) Moreover, when (1.6) holds, it may or may
not produce a strictly positive expression in the RHS of bound (1.1). (Thus, in some cases we
can speak of an improvement in the Ky Fan inequality.) See Ref [10]. We believe that further
studies in this direction should follow, focusing on specific forms of weight function $\phi$.

In our opinion, this paper paves the way to a similar analysis of the whole host of newly
established WDIs. These inequalities should be taken with a justified degree of caution: the offered
sufficient conditions (stated in the form of bounds involving various weight functions) may fail
for particular $C_1$, $C_2$, $\lambda_1$, $\lambda_2$ and $\phi$, and a given WDI may or may not yield an improvement
compared to its 'standard' counterpart. For reader's convenience we list the sufficient conditions
figuring across the paper: Eqns (2.8), (2.20), (3.1), (3.5), (4.1), (4.4), (4.7), (4.10), (5.3), (5.12),

(irm (521, (El, EH, EH and (Iqttd . 

The presented WDIs generalize what is sometimes called elementary information-theoretic
inequalities. An opposite example is the entropy-power inequality and related bounds. Here
the intuition is more intricate; some initial results have been proposed in m- 

The paper is organized as follows. In Section 2 we work with a general setting, elaborating on
properties of weighted entropies which have been established earlier in [9]. Section 3 summarizes
some properties of Gaussian weighted entropies while Section 4 analyzes the behavior of weighted
entropies under mappings; these sections also rely on Ref. [9]. The WDIs are presented in
Sections 5 and 6 as a sequel to the material from Sections 2-4. Again, for reader's convenience
we list them here as Eqns El, EH, EH , EH, EH , EH, El, ETfll and EH). 


2 Random strings and reduced weight functions 

The WE of a probability distribution was introduced in the late 1960s - early 1970s; see, e.g., [1].
(Another term that can be used is a context-dependent or a preferential entropy.) The reader
is referred to [9] where a number of notions and elementary inequalities were established for
the WE, mirroring well-known facts about the standard (Shannon) entropy. We also use Refs
[2, 5] as a source of standard inequalities which we extend to the case of the WE. To keep
pre-emptiveness, we follow the system of notation from [2, 5, 9] with minor deviations.

Let us begin with general definitions. The WE of a random element $X$ taking values in a
standard measure space (SMS) $(\mathcal{X},\mathfrak{X},\nu)$ with a weight function (WF) $x\in\mathcal{X}\mapsto\phi(x)\geq 0$ is
defined by

$h^{\rm w}_\phi(X) = h^{\rm w}_\phi(f) = -E\left(\phi(X)\log f(X)\right) = -\int_{\mathcal{X}}\phi(x)f(x)\log f(x)\,\nu({\rm d}x),$ (2.1)

assuming that $\phi$ is measurable and the integral is absolutely convergent. Here $f = f_X$ is the
probability mass/density function (PM/DF) of $X$ relative to measure $\nu$. Symbol $E$ stands for
the expected value (relative to a probability distribution that is explicitly specified or emerges
from the context in an unambiguous manner).

A number of properties of the WE are related to a Cartesian product structure. Let random
elements $X_1,\ldots,X_n$ be given, taking values in SMSs $(\mathcal{X}_i,\mathfrak{X}_i,\nu_i)$, $1\leq i\leq n$. Set $X_1^n :=
\{X_1,\ldots,X_n\}$ and assume that $X_1,\ldots,X_n$ have a joint PM/DF $f_{X_1^n}(x_1^n)$, $x_1^n\in\mathcal{X}_1^n := \times_{1\leq i\leq n}\mathcal{X}_i$,
relative to the measure $\nu_1^n := \times_{1\leq i\leq n}\nu_i$; for brevity we will sometimes set $f_{X_1^n} = f$. The joint WE
of string $X_1^n$ is defined as

$h^{\rm w}_\phi(X_1^n) = -E\left(\phi(X_1^n)\log f(X_1^n)\right) = -\int_{\mathcal{X}_1^n}\phi(x_1^n)f(x_1^n)\log f(x_1^n)\,\nu_1^n({\rm d}x_1^n).$ (2.2)
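
To make definition (2.1) concrete, here is a minimal sketch for a finite probability mass function with counting reference measure and natural logarithms. The dictionary `pmf` and both weight functions are hypothetical examples introduced only for illustration.

```python
import math

def weighted_entropy(pmf, weight):
    """-sum_x phi(x) f(x) ln f(x), taken over outcomes with f(x) > 0."""
    return -sum(weight(x) * p * math.log(p) for x, p in pmf.items() if p > 0)

# Illustrative PMF on three outcomes.
pmf = {"a": 0.5, "b": 0.25, "c": 0.25}

# Constant weight: the WE reduces to the standard Shannon entropy.
shannon = weighted_entropy(pmf, lambda x: 1.0)

# A weight that doubles the contribution of outcome "a".
emphasis = weighted_entropy(pmf, lambda x: 2.0 if x == "a" else 1.0)
```

Here `shannon` equals $1.5\ln 2$ and `emphasis` equals $2\ln 2$: the weight changes how much each outcome's term $-f(x)\ln f(x)$ counts.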




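Given a set $S\subseteq I := \{1,2,\ldots,n\}$, write

$X(S)$, $X(S^\complement)$ for strings $\{X_i: i\in S\}$, $\{X_i: i\in S^\complement\}$, respectively, where $S^\complement = I\setminus S$. (2.3)

Next, let $x(S)$ and $x(S^\complement)$ stand for

$\{x_i: i\in S\}\in\mathcal{X}(S) := \times_{i\in S}\mathcal{X}_i$ and $\{x_i: i\in S^\complement\}\in\mathcal{X}(S^\complement) := \times_{i\in S^\complement}\mathcal{X}_i.$ (2.4)

Accordingly, the marginal PM/DF $f_{X(S)}(x(S))$ emerges, for which we will often write $f_S(x(S))$
or even $f(x(S))$ for short. Furthermore, given a WF $x_1^n\mapsto\phi(x_1^n)\geq 0$, we define the function
$\psi(S): x(S)\mapsto\psi(S;x(S))\geq 0$ involving the conditional PM/DF $f_{X(S^\complement)|X(S)}\left(x(S^\complement)|x(S)\right)$:

$\psi(S;x(S)) = \int_{\mathcal{X}(S^\complement)}\phi(x_1^n)f_{X(S^\complement)|X(S)}\left(x(S^\complement)|x(S)\right)\nu_{\mathcal{X}(S^\complement)}({\rm d}x(S^\complement))$ (2.5)

where $\nu_{\mathcal{X}(S^\complement)} := \times_{i\in S^\complement}\nu_i$. For brevity we again write sometimes $f_{S^\complement|S}$ instead of $f_{X(S^\complement)|X(S)}$
or omit subscripts altogether. We also write ${\rm d}x(S)$ and ${\rm d}x(S^\complement)$ instead of $\nu_{\mathcal{X}(S)}({\rm d}x(S))$ and
$\nu_{\mathcal{X}(S^\complement)}({\rm d}x(S^\complement))$ and ${\rm d}x$ instead of $\nu_1^n({\rm d}x_1^n)$.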

Function $\psi(S;\,\cdot\,)$ will play the role of a reduced (or induced) WF when we pass from $X_1^n$ to
a sub-string $X(S)$. More precisely, set

$h^{\rm w}_{\psi(S)}(X(S)) = -E\left(\psi(S;X(S))\log f_S(X(S))\right) = -\int_{\mathcal{X}(S)}\psi(S;x(S))f_S(x(S))\log f_S(x(S))\,{\rm d}x(S),$ (2.6)

with $\nu_{\mathcal{X}(S)} := \times_{i\in S}\nu_i$. Cf. [9]. Next, for $k = 1,\ldots,n$ define

$h^{{\rm w},n}_k = \binom{n}{k}^{-1}\sum_{S\subseteq I:\,\#(S)=k}\frac{h^{\rm w}_{\psi(S)}(X(S))}{k}.$ (2.7)

(Here and below, $\#(S)$ and $\#(S^\complement)$ are the cardinalities of $S$ and $S^\complement$.) Here $h^{{\rm w},n}_k$ renders the
averaged WE (per string and per element) of a randomly drawn $k$-element sub-string in $X_1^n$.

In what follows we use the concepts of the conditional and mutual WE and their properties;
cf. [9]. These objects are used with a host of WFs, depending on the context. Consider the
following condition:

$\forall\, i\in S\subseteq I$, with $S_i^- = \{j\in S: j<i\}$ and $S_i^+ = \{j\in S: j>i\}$,
$\int_{\mathcal{X}(S)}\psi(S;x(S))\left\{f(x(S)) - f(x(S_i^-))\times f(x_i|x(S_i^-))f(x(S_i^+)|x(S_i^-))\right\}{\rm d}x(S)\geq 0,$ (2.8)

with standard agreements when one of the sets $S_i^\pm = \emptyset$. Pictorially, Eqn (2.8) is an extension
of bound (1.27) from [9]; it means that for all $i\in S\subseteq I$, the induced WF $\psi(S;\,\cdot\,)$ is correlated
more positively with the marginal PM/DF $f_S(x(S))$ than with the dependence-broken product
$f_{S_i^-}(x(S_i^-))\times\left[f_{\{i\}|S_i^-}(x_i|x(S_i^-))f_{S_i^+|S_i^-}(x(S_i^+)|x(S_i^-))\right]$. Another version of (essentially) the
same property is Eqn (2.20) below.

Remark 2.1 The special choice of sets $S_i^\pm$ is not particularly important: it can be a general
partition of $S\setminus\{i\}$ allowing us to use the chain rule for the conditional WE (see below).


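Theorem 2.2 (Cf. [2], Lemma 7 or [5], Theorem 1.) Let $h^{{\rm w},n}_k$ be defined as in (2.7) and assume
(2.8). Then

$h^{{\rm w},n}_1\geq h^{{\rm w},n}_2\geq\ldots\geq h^{{\rm w},n}_{n-1}\geq h^{{\rm w},n}_n.$ (2.9)

Proof. Begin with the last inequality, $h^{{\rm w},n}_{n-1}\geq h^{{\rm w},n}_n$. Let $1\leq i\leq n$ and choose $S = I$,
$S_i^- = I_i^- := \{1,\ldots,i-1\}$ and $S_i^+ = I_i^+ := \{i+1,\ldots,n\}$, with $\{i\}^\complement = I_i^-\cup I_i^+$ (cf. (2.3), (2.4)).
Then the condition

$\int_{\mathcal{X}_1^n}\phi(x)\left[f(x) - f(x_1^{i-1})f(x_i|x_1^{i-1})f(x_{i+1}^n|x_1^{i-1})\right]{\rm d}x\geq 0$ (by virtue of (2.8))

yields:

$h^{\rm w}_\phi(X_1^n) = h^{\rm w}_\phi(X_i|X(\{i\}^\complement)) + h^{\rm w}_{\psi(\{i\}^\complement)}(X(\{i\}^\complement))$ by the chain rule
$\leq h^{\rm w}_{\psi(I_i^-)}(X_i|X_1^{i-1}) + h^{\rm w}_{\psi(\{i\}^\complement)}(X(\{i\}^\complement))$ by Lemma 1.3 from [9].

Here reduced WFs $\psi(\{i\}^\complement)$ and $\psi(I_i^-)$ are calculated according to the recipes in (2.5), (2.6).
Taking the sum, we obtain:

$n\,h^{\rm w}_\phi(X_1^n)\leq\sum_{i=1}^n h^{\rm w}_{\psi(I_i^-)}(X_i|X_1^{i-1}) + \sum_{i=1}^n h^{\rm w}_{\psi(\{i\}^\complement)}(X(\{i\}^\complement)).$ (2.10)

By using the chain rule, $\sum_{i=1}^n h^{\rm w}_{\psi(I_i^-)}(X_i|X_1^{i-1}) = h^{\rm w}_\phi(X_1^n)$. Hence, Eqn (2.10) becomes

$(n-1)\,h^{\rm w}_\phi(X_1^n)\leq\sum_{i=1}^n h^{\rm w}_{\psi(\{i\}^\complement)}(X(\{i\}^\complement)).$

Consequently,

$\frac{1}{n}h^{\rm w}_\phi(X_1^n)\leq\frac{1}{n}\sum_{i=1}^n\frac{h^{\rm w}_{\psi(\{i\}^\complement)}(X(\{i\}^\complement))}{n-1},$ (2.11)

which yields that $h^{{\rm w},n}_{n-1}\geq h^{{\rm w},n}_n$.

This argument can be repeated if we restrict the WE and the PM/DF to a $k$-element subset
$S = \{i_1,\ldots,i_k\}\subseteq I$ listed in an increasing order of its points and perform a uniform choice over
its $(k-1)$-element subsets. Condition (2.8) yields the bound

$\frac{1}{k}h^{\rm w}_{\psi(S)}(X(S))\leq\frac{1}{k}\sum_{i\in S}\frac{h^{\rm w}_{\psi(S\setminus\{i\})}(X(S\setminus\{i\}))}{k-1}.$

Hence for each $k$-element subset, $h^{{\rm w},k}_{k-1}\geq h^{{\rm w},k}_k$. Therefore, the inequality remains true after
taking the average over all $k$-element subsets drawn uniformly. ■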
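
The quantities in (2.5)-(2.7) are straightforward to compute for a finite joint distribution. The sketch below does so for three binary variables; the joint PMF is a hypothetical example, and the check is run only with $\phi\equiv 1$, where condition (2.8) holds with equality and (2.9) reduces to the classical monotonicity of averaged subset entropies. All function names are illustrative.

```python
import math
from itertools import combinations, product

def marginal(joint, subset):
    """Marginal PMF of the coordinates listed in `subset`."""
    out = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in subset)
        out[key] = out.get(key, 0.0) + p
    return out

def reduced_weight(joint, weight, subset):
    """psi(S; x(S)) = sum over x(S^c) of phi(x) f(x(S^c) | x(S)), cf. (2.5)."""
    marg = marginal(joint, subset)
    psi = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in subset)
        if marg[key] > 0:
            psi[key] = psi.get(key, 0.0) + weight(x) * p / marg[key]
    return psi

def substring_we(joint, weight, subset):
    """Weighted entropy of the sub-string X(S) with reduced weight, cf. (2.6)."""
    marg = marginal(joint, subset)
    psi = reduced_weight(joint, weight, subset)
    return -sum(psi[k] * p * math.log(p) for k, p in marg.items() if p > 0)

def averaged_we(joint, weight, n, k):
    """Average over k-element subsets of the per-element WE, cf. (2.7)."""
    subsets = list(combinations(range(n), k))
    return sum(substring_we(joint, weight, s) / k for s in subsets) / len(subsets)

# Hypothetical dependent joint PMF on {0,1}^3 (symmetric under flipping all bits).
states = list(product((0, 1), repeat=3))
raw = {x: 1.0 + 2.0 * (x[0] == x[1]) + 1.0 * (x[1] == x[2]) for x in states}
norm = sum(raw.values())
joint = {x: w / norm for x, w in raw.items()}

# phi = 1: the averaged entropies should be non-increasing in k.
h = [averaged_we(joint, lambda x: 1.0, 3, k) for k in (1, 2, 3)]
```

For a non-constant weight the same functions can be reused, but monotonicity is then only expected when condition (2.8) is satisfied.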

In Theorem 2.3 we extend the result of Theorem 2.2 to exponents of WEs for sub-strings in
$X_1^n$.

Theorem 2.3 (Cf. [2], Corollary of Lemma 7 or [5], Corollary 1) Given $r > 0$, define:

$g^{{\rm w},n}_k = \binom{n}{k}^{-1}\sum_{S\subseteq I:\,\#(S)=k}\exp\left[\frac{r\,h^{\rm w}_{\psi(S)}(X(S))}{k}\right].$ (2.12)

Then, under assumption (2.8),

$g^{{\rm w},n}_1\geq g^{{\rm w},n}_2\geq\ldots\geq g^{{\rm w},n}_{n-1}\geq g^{{\rm w},n}_n.$ (2.13)

Proof. Again, it is convenient to start with the last bound in (2.13). As in [2], multiply
Eqn (2.11) by $r$, exponentiate and apply the arithmetic-geometric mean inequality to obtain
$g^{{\rm w},n}_{n-1}\geq g^{{\rm w},n}_n$. The result is then completed with the help of the same argument as in the proof of
Theorem 2.2. ■

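In Theorem 2.4 we analyse the averaged conditional WEs for sub-strings in $X_1^n$.

Theorem 2.4 (Cf. [5], Theorem 2.) Let $p^{{\rm w},n}_k$ be defined as

$p^{{\rm w},n}_k = \binom{n}{k}^{-1}\sum_{S\subseteq I:\,\#(S)=k}\frac{h^{\rm w}_\phi(X(S)|X(S^\complement))}{k}.$ (2.14)

Then under the assumption

$\int_{\mathcal{X}_1^n}\phi(x)\left[f(x) - \prod_{i=1}^n f_i(x_i)\right]{\rm d}x\geq 0$ (2.15)

we have that

$p^{{\rm w},n}_1\leq p^{{\rm w},n}_2\leq\ldots\leq p^{{\rm w},n}_{n-1}\leq p^{{\rm w},n}_n.$ (2.16)

Proof. Following the argument used in [9], Theorem 3.1, condition (2.15) yields

$h^{\rm w}_\phi(X_1^n)\leq\sum_{i=1}^n h^{\rm w}_{\psi(\{i\})}(X_i).$

Subtracting both sides from $n\,h^{\rm w}_\phi(X_1^n)$, we obtain:

$(n-1)\,h^{\rm w}_\phi(X_1^n)\geq\sum_{i=1}^n\left[h^{\rm w}_\phi(X_1^n) - h^{\rm w}_{\psi(\{i\})}(X_i)\right].$

By the conditional WE definition,

$h^{\rm w}_\phi(X_1^n) = h^{\rm w}_\phi(X(\{i\}^\complement)|X_i) + h^{\rm w}_{\psi(\{i\})}(X_i).$

Hence,

$(n-1)\,h^{\rm w}_\phi(X_1^n)\geq\sum_{i=1}^n h^{\rm w}_\phi(X(\{i\}^\complement)|X_i).$ (2.17)

Dividing (2.17) by $n(n-1)$ yields that $p^{{\rm w},n}_n\geq p^{{\rm w},n}_{n-1}$. Finally, applying the same argument as in
Theorem 2.2 completes the proof. ■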

The next step is to pass to mutual WEs.

Theorem 2.5 (Cf. [5], Corollary 2.) Consider the averaged mutual WE between a subset (or a
sub-string) and its complement:

$q^{{\rm w},n}_k = \binom{n}{k}^{-1}\sum_{S\subseteq I:\,\#(S)=k}\frac{i^{\rm w}_\phi(X(S):X(S^\complement))}{k},$ (2.18)

and assume (2.8). Then

$q^{{\rm w},n}_1\geq q^{{\rm w},n}_2\geq\ldots\geq q^{{\rm w},n}_{n-1}\geq q^{{\rm w},n}_n.$ (2.19)

Proof. The result is straightforward, from Theorems 2.2 and 2.4 and the following relation
between conditional and mutual WEs:

$i^{\rm w}_\phi(X(S):X(S^\complement)) = h^{\rm w}_{\psi(S)}(X(S)) - h^{\rm w}_\phi(X(S)|X(S^\complement)).$ ■


In Theorem 2.6 we consider the following condition: for all sets $S$ with $\#S\geq 2$ and $i,j\in S$
with $i\neq j$,

$\int_{\mathcal{X}_1^n}\phi(x)f(x(S^\complement)|x(S))\left[f(x(S)) - f(x(S\setminus\{i,j\}))\,f(x_i|x(S\setminus\{i,j\}))\,f(x_j|x(S\setminus\{i,j\}))\right]{\rm d}x\geq 0.$ (2.20)

The meaning of (2.20) is that for all $S$ and $i,j$ as above, the reduced WF $\psi_S(x(S))$ is correlated
more positively with $f(x(S))$ than with the PM/DF $f(x(S\setminus\{i,j\}))\,f(x_i|x(S\setminus\{i,j\}))\,f(x_j|x(S\setminus\{i,j\}))$
where the conditional dependence between $X_i$ and $X_j$ is broken, given $X(S\setminus\{i,j\})$.

Theorem 2.6 (Cf. [5], Theorem 3.) Define the average mutual WE as

$I^{{\rm w},n}_k = \binom{n}{k}^{-1}\sum_{S\subseteq I:\,\#(S)=k}i^{\rm w}_\phi(X(S):X(S^\complement)).$ (2.21)

By symmetry of the mutual WE, $I^{{\rm w},n}_k = I^{{\rm w},n}_{n-k}$. Assume condition (2.20). Then

$I^{{\rm w},n}_1\leq I^{{\rm w},n}_2\leq\ldots\leq I^{{\rm w},n}_{\lfloor n/2\rfloor}.$ (2.22)

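Proof. Let $k\leq\lfloor n/2\rfloor$. If $S$ is a subset of size $k$ then $S$ has $k$ subsets of size $k-1$ (below, $S_j$
stands for $S\setminus\{j\}$, $j\in S$). Thus, we write:

$k\,i^{\rm w}_\phi\left[X(S):X(S^\complement)\right] - \sum_{j\in S}i^{\rm w}_\phi\left[X(S_j):X((S_j)^\complement)\right]$
$= \sum_{j\in S}\left\{i^{\rm w}_\phi\left[(X(S_j),X_j):X(S^\complement)\right] - i^{\rm w}_\phi\left[X(S_j):(X(S^\complement),X_j)\right]\right\}.$

After direct computations, we obtain:

$i^{\rm w}_\phi\left[(X(S_j),X_j):X(S^\complement)\right] = i^{\rm w}_{\psi(S_j\cup S^\complement)}\left[X(S_j):X(S^\complement)\right] + i^{\rm w}_\phi\left[X_j:X(S^\complement)|X(S_j)\right]$

and

$i^{\rm w}_\phi\left[X(S_j):(X(S^\complement),X_j)\right] = i^{\rm w}_{\psi(S_j\cup S^\complement)}\left[X(S_j):X(S^\complement)\right] + i^{\rm w}_\phi\left[X_j:X(S_j)|X(S^\complement)\right].$

Here $i^{\rm w}_\phi\left[X_j:X(S^\complement)|X(S_j)\right]$ and $i^{\rm w}_\phi\left[X_j:X(S_j)|X(S^\complement)\right]$ are mutual-conditional WEs emerging as in
the proof of Theorem 3 from [5]:

$i^{\rm w}_\phi\left[X_j:X(S^\complement)|X(S_j)\right] = E\left[\phi(X)\log\frac{f(X_j,X(S^\complement)|X(S_j))}{f(X_j|X(S_j))\,f(X(S^\complement)|X(S_j))}\right]$
$= \int_{\mathcal{X}_1^n}\phi(x)f(x)\log\frac{f(x_j,x(S^\complement)|x(S_j))}{f(x_j|x(S_j))\,f(x(S^\complement)|x(S_j))}\,{\rm d}x,$ (2.23)

$i^{\rm w}_\phi\left[X_j:X(S_j)|X(S^\complement)\right] = E\left[\phi(X)\log\frac{f(X_j,X(S_j)|X(S^\complement))}{f(X_j|X(S^\complement))\,f(X(S_j)|X(S^\complement))}\right]$
$= \int_{\mathcal{X}_1^n}\phi(x)f(x)\log\frac{f(x_j,x(S_j)|x(S^\complement))}{f(x_j|x(S^\complement))\,f(x(S_j)|x(S^\complement))}\,{\rm d}x.$ (2.24)

In the remaining argument we will make an extensive use of definition (2.5), employing WF
$\psi(S)$ for a number of choices of set $S$.

Using mutual-conditional WEs we can write:

$k\,i^{\rm w}_\phi\left[X(S):X(S^\complement)\right] - \sum_{j\in S}i^{\rm w}_\phi\left[X(S_j):X((S_j)^\complement)\right]$
$= \sum_{j\in S}\left\{i^{\rm w}_\phi\left[X_j:X(S^\complement)|X(S_j)\right] - i^{\rm w}_\phi\left[X_j:X(S_j)|X(S^\complement)\right]\right\}$
$= \sum_{j\in S}\left[h^{\rm w}_{\psi(S)}(X_j|X(S_j)) - h^{\rm w}_\phi(X_j|X(S^\complement),X(S_j)) - h^{\rm w}_{\psi(j\cup S^\complement)}(X_j|X(S^\complement)) + h^{\rm w}_\phi(X_j|X(S^\complement),X(S_j))\right]$
$= \sum_{j\in S}\left[h^{\rm w}_{\psi(S)}(X_j|X(S_j)) - h^{\rm w}_{\psi(j\cup S^\complement)}(X_j|X(S^\complement))\right].$ (2.25)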

Summing over all subsets of size k and reversing the order of summation, we obtain: 

) c )]j 


f 



E \ ki « 

X(S) : X(S C ) 

-EA 

SCI:#(S)=k ( 


J6S 


(2.26) 


= E E 

3 =1 SCI: #(S)=k,j£S 

The RHS of (J2.26I) can be rewritten in the following way: 


[^(jUSj) (Xj\X{S S )) - h^ ) {X j \X{!?‘)) 


E E 

3 = 1 S': #(S')=k-l,j<£S 
or equivalently 


- h 1«s’v j )* Jj )( x iW s ' u ^ c )) 


E 

3 = 1 


E '•J (s . u ,-,(X 3 |X(S)) - £ '>+( S »u )) ra*(S , "» 

S': #(5')=fc-l,5'C{i} c 5": #(5")=n-fc,5"C{j} c 


Since $k\leq\lfloor n/2\rfloor$, then $k-1 < n-k$. A set $S''$ with $n-k$ elements has $\binom{n-k}{k-1}$ subsets of
size $k-1$. Owing to Lemma 1.3 from [9], for each such subset $S'\subset S''$, under assumption (2.20)
we have that


With the same argument as in [5] we conclude from (12.271) that 


(2.27) 


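$\sum_{S\subseteq I:\,\#(S)=k}\left\{k\,i^{\rm w}_\phi\left[X(S):X(S^\complement)\right] - \sum_{j\in S}i^{\rm w}_\phi\left[X(S_j):X((S_j)^\complement)\right]\right\}\geq 0.$

Then, since each set of size $k-1$ occurs $n-k+1$ times in the second sum, we can write

$k\sum_{S\subseteq I:\,\#(S)=k}i^{\rm w}_\phi(X(S):X(S^\complement))\geq(n-k+1)\sum_{S'\subseteq I:\,\#(S')=k-1}i^{\rm w}_\phi(X(S'):X(S'^\complement)).$

Dividing by $k\binom{n}{k}$ concludes the proof. ■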
3 Gaussian weighted entropies

As we said in the introduction, the WDIs are connected with the Gaussian WE
$h^{\rm w}_\phi(f^{\rm No}_C) := -\int_{\mathbb{R}^d}\phi(x)f^{\rm No}_C(x)\log f^{\rm No}_C(x)\,{\rm d}x$; cf. (1.3), (1.5). Throughout the paper we use a number of
properties established in [9]. One of them is maximization of the WE $h^{\rm w}_\phi(f) := -\int_{\mathbb{R}^d}\phi(x)f(x)\log f(x)\,{\rm d}x$
at $f = f^{\rm No}_C$. More precisely, consider the following inequalities

$\int_{\mathbb{R}^d}\phi(x)\left[f(x) - f^{\rm No}_C(x)\right]{\rm d}x\geq 0,$
$\log\left[(2\pi)^d(\det C)\right]\int_{\mathbb{R}^d}\phi(x)\left[f(x) - f^{\rm No}_C(x)\right]{\rm d}x + {\rm tr}\left[C^{-1}\int_{\mathbb{R}^d}xx^{\rm T}\phi(x)\left[f(x) - f^{\rm No}_C(x)\right]{\rm d}x\right]\leq 0.$ (3.1)


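Theorem 3.1 Let $X = X_1^d\sim f(x)$, $x\in\mathbb{R}^d$, be a random vector with PDF $f$, mean zero and
covariance matrix

$C = E_C\left((X_1^d)(X_1^d)^{\rm T}\right) = \int_{\mathbb{R}^d}xx^{\rm T}f^{\rm No}_C(x)\,{\rm d}x.$

Set:

$\Phi = E_C\left((X_1^d)(X_1^d)^{\rm T}\phi(X_1^d)\right) = \int_{\mathbb{R}^d}xx^{\rm T}\phi(x)f^{\rm No}_C(x)\,{\rm d}x$

and suppose that (3.1) is fulfilled. Then

$h^{\rm w}_\phi(f)\leq\frac{\alpha_\phi(C)}{2}\log\left[(2\pi)^d(\det C)\right] + \frac{\log e}{2}\,{\rm tr}\,C^{-1}\Phi,$ (3.2)

with equality iff $f = f^{\rm No}_C$ modulo $\phi$.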


The proof of Theorem 3.1 follows the argument in Example 3.1 from [9] repeated verbatim
in the multi-dimensional setting.

A conditional form of Theorem 3.1 is Theorem 3.2 below. The corresponding assertion for
the standard entropy was noted in the earlier literature. See, e.g., Ref. [6, p. 1516]: the proof
of Theorem 29, item (c), the reference to a conditional version of [6, Lemma 5]. The proof of
Theorem 3.2 is essentially hinted in its statement (see Eqn (3.6)), and we omit it from the paper.

Given a $d\times d$ positive-definite matrix $C$ and $p = 1,\ldots,d-1$, write $C$ in the block form:

$C = \begin{pmatrix}C_1^p & C_p^{n-p}\\ C_{n-p}^p & C_{p+1}^d\end{pmatrix}$ (3.3)

where $C_p^{n-p}$ and $C_{n-p}^p$ are mutually transposed $p\times(n-p)$ and $(n-p)\times p$ matrices. Given
$x = \begin{pmatrix}x_1^p\\ x_{p+1}^d\end{pmatrix}$, set $Dx_{p+1}^d = C_p^{n-p}\left(C_{p+1}^d\right)^{-1}x_{p+1}^d$ and $K_1^p = C_1^p - C_p^{n-p}\left(C_{p+1}^d\right)^{-1}C_{n-p}^p$.

Correspondingly, if $X = X_1^d$ is a random vector (RV) with PDF $f_X$ and covariance matrix $C$ then
$C_1^p$ represents the covariance matrix for vector $X_1^p$, with PDF $f_{X_1^p}(x_1^p)$. Let $X_{p+1}^d$ stand for the
residual/remaining random vector and set $f_{X_{p+1}^d|X_1^p}(x_{p+1}^d|x_1^p) = \frac{f_X(x)}{f_{X_1^p}(x_1^p)}$. Also denote by $N$, $N_1^p$
and $N_{p+1}^d$ the corresponding Gaussian vectors, with PDFs $f_N(x) = f^{\rm No}_C(x)$, $f_{N_1^p}(x_1^p) = f^{\rm No}_{C_1^p}(x_1^p)$
and $f_{N_{p+1}^d|N_1^p}(x_{p+1}^d|x_1^p)$. Finally, for a given WF $x\in\mathbb{R}^d\mapsto\phi(x)$ set:

$\psi(x_1^p) = \int_{\mathbb{R}^{d-p}}\phi(x)f_{N_{p+1}^d|N_1^p}(x_{p+1}^d|x_1^p)\,{\rm d}x_{p+1}^d,$
$\alpha(C_1^p) = \int\psi(x_1^p)f_{N_1^p}(x_1^p)\,{\rm d}x_1^p$, $\alpha(C) = \int\phi(x)f_N(x)\,{\rm d}x,$ (3.4)
$\Psi_{N_1^p} = \int x_1^p(x_1^p)^{\rm T}\psi(x_1^p)f_{N_1^p}(x_1^p)\,{\rm d}x_1^p$, $\Phi_N = \int xx^{\rm T}\phi(x)f_N(x)\,{\rm d}x.$

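Also, consider inequalities

$\int\phi(x)f_{X_1^p}(x_1^p)\left[f_{X_{p+1}^d|X_1^p}(x_{p+1}^d|x_1^p) - f_{N_{p+1}^d|N_1^p}(x_{p+1}^d|x_1^p)\right]{\rm d}x\geq 0,$
$\int\phi(x)\left[f_X(x) - f_N(x)\right]\left\{\log\left[(2\pi)^p\det(K_1^p)\right] + (\log e)\left(x_1^p - Dx_{p+1}^d\right)^{\rm T}(K_1^p)^{-1}\left(x_1^p - Dx_{p+1}^d\right)\right\}{\rm d}x\leq 0.$ (3.5)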


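Theorem 3.2 Make an assumption that bounds (3.5) are satisfied. Then the following
inequality holds true:

$h^{\rm w}_\phi(X_{p+1}^d|X_1^p) := -\int\phi(x)f_X(x)\log f_{X_{p+1}^d|X_1^p}(x_{p+1}^d|x_1^p)\,{\rm d}x$
$\leq h^{\rm w}_\phi(N_{p+1}^d|N_1^p) = h^{\rm w}_\phi(N) - h^{\rm w}_\psi(N_1^p)$
$= \frac{\alpha(C)}{2}\log\left[(2\pi)^d\det C\right] + \frac{\log e}{2}\,{\rm tr}\left[C^{-1}\Phi_N\right] - \frac{\alpha(C_1^p)}{2}\log\left[(2\pi)^p\det C_1^p\right] - \frac{\log e}{2}\,{\rm tr}\left[(C_1^p)^{-1}\Psi_{N_1^p}\right].$ (3.6)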


4 Weighted entropies under mappings 

In this section we give a series of general theorems (Theorems 4.1 - 4.3 and Theorem 4.4) reflecting
properties of the WEs under mappings of random variables (an example is a sum). Of a special
importance for us is Theorem 4.4 used in Section 5. In essence, Theorems 4.1 - 4.3 are repetitions
of their counterparts from [9], and we omit their proofs.

Theorem 4.1 (Cf. Lemma 1.1 from [9].) Let $(\mathcal{X},\mathfrak{X},\nu_X)$, $(\mathcal{Y},\mathfrak{Y},\nu_Y)$ be a pair of Lebesgue
spaces and suppose $X$, $Y$ are random elements in $(\mathcal{X},\mathfrak{X})$, $(\mathcal{Y},\mathfrak{Y})$ with PM/DFs $f_X$, $f_Y$, relative
to measures $\nu_X$, $\nu_Y$, respectively. Suppose $\eta: (\mathcal{X},\mathfrak{X})\to(\mathcal{Y},\mathfrak{Y})$ is a measurable map onto, and
that $\nu_Y(B) = \nu_X(\eta^{-1}B)$, $B\in\mathfrak{Y}$. Consider the partition of $\mathcal{X}$ with elements $B(y) := \{x\in
\mathcal{X}: \eta x = y\}$ and let $\nu_X(\,\cdot\,|y)$ be the family of induced measures on $B(y)$, $y\in\mathcal{Y}$. Suppose
that $f_Y(y) = \int_{B(y)}f_X(x)\nu_X({\rm d}x|y)$ and for $x\in B(y)$ let $f_{X|Y}(x|y) := \frac{f_X(x)}{f_Y(y)}$ denote the PM/DF
of $X$ conditional on $Y = y$. (Recall, $f_{X|Y}(\,\cdot\,|y)$ is a family of PM/DFs defined for $f_Y$-a.a.
$y\in\mathcal{Y}$ such that

$\int_{\mathcal{X}}G(x)f_X(x)\nu_X({\rm d}x) = \int_{\mathcal{Y}}\int_{B(y)}G(x)f_{X|Y}(x|y)\nu_X({\rm d}x|y)f_Y(y)\nu_Y({\rm d}y)$

for any non-negative measurable function $G$.) Suppose that a WF $x\in\mathcal{X}\mapsto\phi(x)\geq 0$ obeys

$\int_{\mathcal{X}}\phi(x)f_X(x)\left[f_{X|Y}(x|\eta x) - 1\right]\nu_X({\rm d}x)\leq 0$ (4.1)

and set

$\psi(y) = \int_{B(y)}\phi(x)f_{X|Y}(x|y)\nu_X({\rm d}x|y)$, $y\in\mathcal{Y}.$ (4.2)

Then

$h^{\rm w}_\phi(X)\geq h^{\rm w}_\psi(Y) := -\int_{\mathcal{Y}}\psi(y)f_Y(y)\log f_Y(y)\nu_Y({\rm d}y)$, or
$h^{\rm w}_\phi(X|Y) := -\int_{\mathcal{X}}\phi(x)f_X(x)\log f_{X|Y}(x|\eta x)\nu_X({\rm d}x)\geq 0,$ (4.3)

with equality iff $\phi(x)\left[f_{X|Y}(x|\eta x) - 1\right] = 0$ for $f_X$-a.a. $x\in\mathcal{X}$.

In particular, suppose that for $f_Y$-a.a. $y\in\mathcal{Y}$ set $B(y)$ contains at most countably many
values and $\nu_X(\,\cdot\,|y)$ is a counting measure with $\nu_X(x|y) = 1$, $x\in B(y)$. Then the value $f_{X|Y}(x|\eta x)$
yields the conditional probability $P(X = x|Y = \eta x)$, which is $\leq 1$ for $f_Y$-a.a. $y\in\mathcal{Y}$. Then
$h^{\rm w}_\phi(X|Y)\geq 0$ and the bound is strict unless, modulo $\phi$, map $\eta$ is 1-1.

Theorem 4.2 (Cf. Lemma 1.2 from [9].) Let $(\mathcal{X},\mathfrak{X},\nu_X)$, $(\mathcal{Y},\mathfrak{Y},\nu_Y)$, $(\mathcal{Z},\mathfrak{Z},\nu_Z)$ be a triple of
SMSs and suppose $X$, $Y$, $Z$ are random elements in $(\mathcal{X},\mathfrak{X})$, $(\mathcal{Y},\mathfrak{Y})$, $(\mathcal{Z},\mathfrak{Z})$. Let $f_X$ be the
PM/DF for $X$ relative to measure $\nu_X$ and $f_{Y,Z}$ the joint PM/DF for $Y$, $Z$ relative to measure
$\nu_Y\times\nu_Z$. Further, set $f_Z(z) := \int_{\mathcal{Y}}f_{Y,Z}(y,z)\nu_Y({\rm d}y)$ and $f_{Y|Z}(y|z) = \frac{f_{Y,Z}(y,z)}{f_Z(z)}$. Suppose that

$\eta: (\mathcal{X},\mathfrak{X})\to(\mathcal{Y},\mathfrak{Y})$, $\zeta: (\mathcal{X},\mathfrak{X})\to(\mathcal{Z},\mathfrak{Z})$

is a pair of measurable maps onto, and that

$\nu_Y(A) = \nu_X(\eta^{-1}A)$, $A\in\mathfrak{Y}$, $\nu_Z(B) = \nu_X(\zeta^{-1}B)$, $B\in\mathfrak{Z}.$

Consider the partition of $\mathcal{X}$ with elements $B(y,z) := \{x\in\mathcal{X}: \eta x = y,\ \zeta x = z\}$ and let $\nu_X(\,\cdot\,|y,z)$
be the family of induced measures on $B(y,z)$, $(y,z)\in\mathcal{Y}\times\mathcal{Z}$. Suppose that

$f_{Y,Z}(y,z) = \int_{B(y,z)}f_X(x)\nu_X({\rm d}x|y,z)$

and for $x\in B(y,z)$ let $f_{X|Y,Z}(x|y,z) := \frac{f_X(x)}{f_{Y,Z}(y,z)}$ denote the PM/DF of $X$ conditional on
$Y = y$, $Z = z$. (Recall, $f_{X|Y,Z}(\,\cdot\,|y,z)$ is a family of PM/DFs defined for $f_{Y,Z}$-a.a. $(y,z)\in\mathcal{Y}\times\mathcal{Z}$
such that

$\int_{\mathcal{X}}G(x)f_X(x)\nu_X({\rm d}x) = \int_{\mathcal{Y}\times\mathcal{Z}}\int_{B(y,z)}G(x)f_{X|Y,Z}(x|y,z)\nu_X({\rm d}x|y,z)f_{Y,Z}(y,z)\nu_Y({\rm d}y)\nu_Z({\rm d}z)$

for any non-negative measurable function $G$.) Assume that a WF $x\in\mathcal{X}\mapsto\phi(x)\geq 0$ obeys

$\int_{\mathcal{X}}\phi(x)f_X(x)\left[f_{X|Y,Z}(x|\eta x,\zeta x) - 1\right]\nu_X({\rm d}x)\leq 0$ (4.4)

and set

$\psi(y,z) = \int_{B(y,z)}\phi(x)f_{X|Y,Z}(x|y,z)\nu_X({\rm d}x|y,z).$ (4.5)

Then

$-\int_{\mathcal{Y}\times\mathcal{Z}}\psi(y,z)f_{Y,Z}(y,z)\log f_{Y|Z}(y|z)\nu_Y({\rm d}y)\nu_Z({\rm d}z)$
$=: h^{\rm w}_\psi(Y|Z)\leq h^{\rm w}_\phi(X|Z) := -\int_{\mathcal{X}}\phi(x)f_X(x)\log f_{X|Z}(x|\zeta x)\nu_X({\rm d}x);$ (4.6)

equality iff $\phi(x)\left[f_{X|Y,Z}(x|\eta x,\zeta x) - 1\right] = 0$ for $f_X$-a.a. $x\in\mathcal{X}$.

As in Theorem 4.1, assume $B(y,z)$ consists of at most countably many values and $\nu_X(x|y,z) =
1$, $x\in B(y,z)$ for $f_{Y,Z}$-a.a. $(y,z)\in\mathcal{Y}\times\mathcal{Z}$. Then the value $f_{X|Y,Z}(x|y,z)$ yields the conditional
probability $P(X = x|Y = y, Z = z)$, for $f_{Y,Z}$-a.a. $(y,z)\in\mathcal{Y}\times\mathcal{Z}$. Then $h^{\rm w}_\phi(X|Z)\geq h^{\rm w}_\psi(Y|Z)$,
with equality iff, modulo $\phi$, the map $x\mapsto(\eta x,\zeta x)$ is 1-1.


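Theorem 4.3 (Cf. Lemma 1.3 from [9].) Let $(\mathcal{X},\mathfrak{X},\nu_X)$, $(\mathcal{Y},\mathfrak{Y},\nu_Y)$, $(\mathcal{Z},\mathfrak{Z},\nu_Z)$ be a triple of
SMSs and suppose $X$, $Y$, $Z$ are random elements in $(\mathcal{X},\mathfrak{X})$, $(\mathcal{Y},\mathfrak{Y})$, $(\mathcal{Z},\mathfrak{Z})$. Let $f_{X,Y}$ be the
joint PM/DF for $X$, $Y$ relative to measure $\nu_X\times\nu_Y$ and set

$f_Y(y) = \int_{\mathcal{X}}f_{X,Y}(x,y)\nu_X({\rm d}x)$, $f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}.$

Suppose that

$\xi: (\mathcal{Y},\mathfrak{Y})\to(\mathcal{Z},\mathfrak{Z})$

is a measurable map onto, and that

$\nu_Z(C) = \nu_Y(\xi^{-1}C)$, $C\in\mathfrak{Z}.$

Consider a partition of $\mathcal{Y}$ with elements $C(z) := \{y\in\mathcal{Y}: \xi y = z\}$ and let $\nu_Y(\,\cdot\,|z)$ be the family
of induced measures on $C(z)$, $z\in\mathcal{Z}$. Given $(x,z)\in\mathcal{X}\times\mathcal{Z}$ and $y\in C(z)$, let

$f_{X,Z}(x,z) = \int_{C(z)}f_{X,Y}(x,y)\nu_Y({\rm d}y|z)$, $f_Z(z) = \int_{\mathcal{X}}f_{X,Z}(x,z)\nu_X({\rm d}x),$

and

$f_{X|Z}(x|z) = \frac{f_{X,Z}(x,z)}{f_Z(z)}$, $f_{Y|Z}(y|z) = \frac{f_Y(y)}{f_Z(z)}.$

Assume that a WF $(x,y)\mapsto\phi(x,y)\geq 0$ obeys

$\int_{\mathcal{X}\times\mathcal{Y}}\phi(x,y)\left[f_{X,Y}(x,y) - f_Z(\xi y)f_{X|Z}(x|\xi y)f_{Y|Z}(y|\xi y)\right]\nu_X({\rm d}x)\nu_Y({\rm d}y)\geq 0$ (4.7)

and set

$\psi(x,z) = \int_{C(z)}\phi(x,y)f_{Y|Z}(y|z)\nu_Y({\rm d}y|z).$ (4.8)

Then

$-\int_{\mathcal{X}\times\mathcal{Z}}\psi(x,z)f_{X,Z}(x,z)\log f_{X|Z}(x|z)\nu_X({\rm d}x)\nu_Z({\rm d}z)$
$=: h^{\rm w}_\psi(X|Z)\geq h^{\rm w}_\phi(X|Y) := -\int_{\mathcal{X}\times\mathcal{Y}}\phi(x,y)f_{X,Y}(x,y)\log f_{X|Y}(x|y)\nu_X({\rm d}x)\nu_Y({\rm d}y).$ (4.9)

Furthermore, equality in (4.9) holds iff $X$ and $Y$ are conditionally independent given $Z$ modulo
$\phi$, i.e. $\phi(x,y)\left[f_{X,Y}(x,y) - f_Z(\xi y)f_{X|Z}(x|\xi y)f_{Y|Z}(y|\xi y)\right] = 0$.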

We will use an alternative notation $h^{\rm w}_\phi(X) := h^{\rm w}_\phi(f_X)$ where $X = X_1^d = (X_1,\ldots,X_d)^{\rm T}$ is a
$d$-dimensional random vector with PDF $f_X(x)$. In this context, we employ the notation $X\sim f_X$,
$Y\sim f_Y$, $(X,Y)\sim f_{X,Y}$ and $(X|Y)\sim f_{X|Y}$ where $f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$.

Theorem 4.4 below mimics a result in [2], extending from the case of a standard entropy to
that of the WE. A number of facts are related to the conditional WE

$h^{\rm w}_\phi(X|Y) := -\int_{\mathbb{R}^d\times\mathbb{R}^d}\phi(x,y)f_{X,Y}(x,y)\log f_{X|Y}(x|y)\,{\rm d}x\,{\rm d}y$

or, more generally,

$h^{\rm w}_\phi(U|V) := -\int_{\mathbb{R}^d\times\mathbb{R}^d}\phi(u,v)f_{U,V}(u,v)\log f_{U|V}(u|v)\,{\rm d}u\,{\rm d}v.$

Here a pair $(U,V)$ is a function of $(X,Y)$ with a joint PM/DF $f_{U,V}$, marginal PM/DFs $f_U$,
$f_V$ and conditional PM/DF $f_{U|V}(u|v) := \frac{f_{U,V}(u,v)}{f_V(v)}$. (Viz., $U = Y$, $V = X+Y$.) WF $\phi$ may
or may not be involved with the map $(X,Y)\mapsto(U,V)$.

Theorem 4.4 Suppose $X$ and $Y$ are independent random vectors of dimension $d$, with PDFs
$f_X$ and $f_Y$:

$(X,Y)\sim f_{X,Y}$ where $f_{X,Y}(x,y) = f_X(x)f_Y(y)$, $x,y\in\mathbb{R}^d$.

Assume that WF $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d\mapsto\phi(x,y)\geq 0$ obeys

$\int_{\mathbb{R}^d\times\mathbb{R}^d}\phi(x,y)f_Y(y)\left[f_X(x) - f_{X+Y}(x+y)\right]{\rm d}x\,{\rm d}y\geq 0$ (4.10)

and set

$\theta(v) = \int_{\mathbb{R}^d}\phi(v-y,y)f_{Y|X+Y}(y|v)\,{\rm d}y$, $\theta^*(x) = \int_{\mathbb{R}^d}\phi(x+y,y)f_Y(y)\,{\rm d}y$, $v,x\in\mathbb{R}^d.$ (4.11)

Then

$h^{\rm w}_\theta(X+Y)\geq h^{\rm w}_{\theta^*}(X),$ (4.12)

with equality iff $\phi(x,y)f_Y(y)\left[f_X(x) - f_{X+Y}(x+y)\right] = 0$ for Lebesgue-a.a. $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$.

Proof. Set: $\phi^*(x,y) = \phi(x+y,y)$. The following relations (a)-(c) hold true:

(a) $h^{\rm w}_\theta(X+Y)\geq h^{\rm w}_\phi(X+Y|Y)$,
(b) $h^{\rm w}_\phi(X+Y|Y) = h^{\rm w}_{\phi^*}(X|Y)$, (c) $h^{\rm w}_{\phi^*}(X|Y) = h^{\rm w}_{\theta^*}(X)$. (4.13)

Here bound (a) comes from the sub-additivity of the WE, see [9], Theorem 1.3 or Eqn (1.31)
from [9]. Next, (b) is derived by applying the following equations:

$h^{\rm w}_\phi(X+Y|Y) = \int f_Y(y)h^{\rm w}_\phi(X+Y|Y=y)\,{\rm d}y$
$= -\int\phi(x+y,y)f_Y(y)f_{X|Y}(x|y)\log f_{X|Y}(x|y)\,{\rm d}x\,{\rm d}y.$

Finally, Eqn (c) holds because $X$ and $Y$ are independent.

The proof of Theorem 4.4 is completed by observing that

$h^{\rm w}_{\phi^*}(X|Y) = -\int_{\mathbb{R}^d\times\mathbb{R}^d}\phi(x+y,y)f_{X,Y}(x,y)\log f_{X|Y}(x|y)\,{\rm d}x\,{\rm d}y$
$= -\int_{\mathbb{R}^d}\left[\int_{\mathbb{R}^d}\phi(x+y,y)f_Y(y)\,{\rm d}y\right]f_X(x)\log f_X(x)\,{\rm d}x.$ ■

Remark 4.5 The assertion of Theorem 4.4 remains valid, mutatis mutandis, when $X$ and $Y$
have different dimensions. Viz., we can assume that $Y$ has dimension $d' < d$ and append $Y$ and
$y$ with zero entries when we sum $X+Y$ and $x+y$.

5 Miscellaneous weighted determinant inequalities

In this section we present a host of WDIs derived from properties of the WEs. As we said before,
the proposed inequalities hold when WF $\phi\equiv 1$ (in this case the stated conditions are trivially
fulfilled). To stress parallels with 'standard' DIs, we provide references to [2] or [5] in each case
under consideration.


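Theorem 5.1 (Cf. [2], Theorem 2.) Let $X$, $Y$ be independent $d$-variate normal vectors with
zero means and covariance matrices $C_1$, $C_2$, respectively: $f_{X,Y}(x,y) = f_X(x)f_Y(y)$, $x,y\in\mathbb{R}^d$,
where $f_X = f^{\rm No}_{C_1}$, $f_Y = f^{\rm No}_{C_2}$. Given a WF $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d\mapsto\phi(x,y)\geq 0$, positive on an open
domain in $\mathbb{R}^d\times\mathbb{R}^d$, consider a quantity $\beta$ and $d\times d$ matrices $\Theta$, $\Theta^*$:

$\beta = \int_{\mathbb{R}^d}\theta(x)f^{\rm No}_{C_1+C_2}(x)\,{\rm d}x$, $\Theta = \int_{\mathbb{R}^d}xx^{\rm T}\theta(x)f^{\rm No}_{C_1+C_2}(x)\,{\rm d}x$, $\Theta^* = \int_{\mathbb{R}^d}xx^{\rm T}\theta^*(x)f^{\rm No}_{C_1}(x)\,{\rm d}x$ (5.1)

where $\theta$ and $\theta^*$ are as in (4.11):

$\theta(x) = \int_{\mathbb{R}^d}\phi(z,x-z)f_{Y|X+Y}(x-z|x)\,{\rm d}z$, $\theta^*(x) = \int_{\mathbb{R}^d}\phi(x+y,y)f_Y(y)\,{\rm d}y.$ (5.2)

Assume the condition emulating (4.10):

$\int_{\mathbb{R}^d\times\mathbb{R}^d}\phi(x,y)f^{\rm No}_{C_2}(y)\left[f^{\rm No}_{C_1}(x) - f^{\rm No}_{C_1+C_2}(x+y)\right]{\rm d}x\,{\rm d}y\geq 0.$ (5.3)

Then

$\beta\log\frac{\det(C_1+C_2)}{\det C_1} + (\log e)\left\{{\rm tr}\left[(C_1+C_2)^{-1}\Theta\right] - {\rm tr}\left(C_1^{-1}\Theta^*\right)\right\}\geq 0.$ (5.4)

Proof. Using Theorem 4.4 and Eqn (1.3), we can write:

$\frac{1}{2}\log\left[(2\pi)^d(\det(C_1+C_2))\right]\int_{\mathbb{R}^d}\theta(x)f^{\rm No}_{C_1+C_2}(x)\,{\rm d}x + \frac{\log e}{2}\,{\rm tr}\,(C_1+C_2)^{-1}\Theta$
$\geq\frac{1}{2}\log\left[(2\pi)^d(\det C_1)\right]\int_{\mathbb{R}^d}\theta^*(x)f^{\rm No}_{C_1}(x)\,{\rm d}x + \frac{\log e}{2}\,{\rm tr}\,C_1^{-1}\Theta^*.$

The bound in (5.4) then follows. ■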
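
For orientation, the constant-weight case of (5.4) can be evaluated directly: with $\phi\equiv 1$ one has $\beta = 1$, $\Theta = C_1+C_2$ and $\Theta^* = C_1$, so the two trace terms cancel and (5.4) reduces to $\det(C_1+C_2)\geq\det C_1$. The sketch below evaluates the left-hand side of (5.4) for $2\times 2$ matrices with natural logarithms (so $\log e = 1$); the matrices and helper names are illustrative assumptions.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def inv2(m):
    """Inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = m
    det = det2(m)
    return ((d / det, -b / det), (-c / det, a / det))

def add2(m1, m2):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(m1[i][j] + m2[i][j] for j in range(2)) for i in range(2))

def trace_of_product(m1, m2):
    """tr(m1 m2) for 2x2 matrices."""
    return sum(m1[i][k] * m2[k][i] for i in range(2) for k in range(2))

def lhs_5_4(c1, c2, beta, theta, theta_star):
    """Left-hand side of (5.4) with natural logarithms."""
    total = add2(c1, c2)
    log_ratio = math.log(det2(total) / det2(c1))
    traces = trace_of_product(inv2(total), theta) - trace_of_product(inv2(c1), theta_star)
    return beta * log_ratio + traces

# Hypothetical covariance matrices.
C1 = ((2.0, 0.5), (0.5, 1.0))
C2 = ((1.0, -0.3), (-0.3, 3.0))

# phi = 1: beta = 1, Theta = C1 + C2, Theta* = C1.
value = lhs_5_4(C1, C2, 1.0, add2(C1, C2), C1)
```

For a non-constant weight, `beta`, `theta` and `theta_star` would have to be computed from (5.1), (5.2), for instance by numerical integration.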


Remark 5.2 It is instructive to observe that (5.4) is equivalent to:

$\beta\log\left[\det\left(I + C_1^{-1}C_2\right)\right] + (\log e)\,{\rm tr}\left[(C_1+C_2)^{-1}\Theta^* - C_1^{-1}\Theta^* + (C_1+C_2)^{-1}\Xi\right]\geq 0$ (5.5)

where

$\Xi = \int_{\mathbb{R}^d\times\mathbb{R}^d}\left(xy^{\rm T} + yx^{\rm T} + yy^{\rm T}\right)\phi(x+y,y)f^{\rm No}_{C_2}(y)f^{\rm No}_{C_1}(x)\,{\rm d}y\,{\rm d}x.$

This claim is verified by observing that $\Theta = \Theta^* + \Xi$.

Remark 5.3 As above, we can assume that $C_2$ is a matrix of size $d'\times d'$, agreeing that in the
sum $C_1+C_2$, matrix $C_2$ is identified as a top left block (say). This is possible because in Eqns
(5.4) and (5.5) we do not use the inverse $C_2^{-1}$ or the determinant $\det C_2$.

To this end, recall the following theorem from [7]:

Theorem 5.4 Let $G$ and $G+E$ be nonsingular matrices where $E$ is a matrix of rank one. Let
$g = {\rm tr}(EG^{-1})$. Then $g\neq -1$ and

$(G+E)^{-1} = G^{-1} - \frac{1}{1+g}\,G^{-1}EG^{-1}.$

The above equation is essentially the Sherman-Morrison formula (see 03 , P- 161). 
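
The rank-one inverse identity of Theorem 5.4 is easy to confirm numerically. The sketch below builds a rank-one matrix $E = uv^{\rm T}$ and compares both sides for a $2\times 2$ example; the particular `G`, `u`, `v` are illustrative values, not taken from the paper.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse2(m):
    """Inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

G = [[4.0, 1.0], [1.0, 3.0]]
u, v = [1.0, 2.0], [0.5, -1.0]
E = [[u[i] * v[j] for j in range(2)] for i in range(2)]   # rank-one matrix u v^T

G_inv = inverse2(G)
EG_inv = matmul(E, G_inv)
g = EG_inv[0][0] + EG_inv[1][1]                           # g = tr(E G^{-1})

# Left-hand side: direct inverse of G + E.
lhs = inverse2([[G[i][j] + E[i][j] for j in range(2)] for i in range(2)])

# Right-hand side: G^{-1} - (1 / (1 + g)) G^{-1} E G^{-1}.
correction = matmul(matmul(G_inv, E), G_inv)
rhs = [[G_inv[i][j] - correction[i][j] / (1.0 + g) for j in range(2)] for i in range(2)]

max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The update costs only matrix products with the already known $G^{-1}$, which is why the identity is convenient when $C_2$ has rank one.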


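Assuming that $C_2 = E$ has rank 1 and letting $g = {\rm tr}(EC_1^{-1})$, inequality (5.4) turns into the
following bound:

$\beta\log\frac{\det(C_1+E)}{\det C_1} + (\log e)\left(-\frac{1}{1+g}\,{\rm tr}\left(C_1^{-1}EC_1^{-1}\Theta^*\right) + {\rm tr}\left\{(C_1+E)^{-1}\Xi\right\}\right)\geq 0,$ (5.6)

where $\Xi := \Theta - \Theta^*$.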

The techniques developed so far allows us to prove Theorem 15.51 below rendering a weighted 
form of Szasz theorem. Suppose C is a positive definite d x d matrix. Given 1 < k < d and a 
set S C := {1,..., d} with ff(S) = k , denote by C(S) be the k x k sub-matrix of C formed 
by the rows and columns with indices i G S. With every S we associate a Gaussian random 
vector X(S') ~ /c°s) considered as a sub-collection of X ~ /£J 0 . Accordingly, conditional 

PDFs emerge, /^ ; (x(S')|x(S")), for pairs of sets S 7 S' with S D S' = 0, where x(S) G 
x(5') G [The PDF /|j°, is expressed in terms of block sub-matrices forming the inverse 

matrix C(S U S") -1 .] 

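Further, let a function $\phi(x) \geq 0$, $x \in \mathbb{R}^d$, be given, which is positive on an open domain in $\mathbb{R}^d$, and set, as in (2.5),

$$\psi(S; x(S)) = \int_{\mathbb{R}^{\#(S^{\rm c})}} \phi(x)\, f^{\rm No}_{S^{\rm c}|S}\big(x(S^{\rm c})\,|\,x(S)\big)\,{\rm d}x(S^{\rm c}). \tag{5.7}$$

Furthermore, define:

$$\tau(S) = {\rm tr}\big[C(S)^{-1}\Phi(S)\big], \qquad T(k) = \sum_{S \subseteq I^{(d)}:\ \#(S)=k} \tau(S), \tag{5.8}$$

where matrix $\Phi(S)$ is given by

$$\Phi(S) = \Phi(C(S)) = \int_{\mathbb{R}^{\#(S)}} x(S)\, x(S)^{\rm T}\, \psi(S; x(S))\, f^{\rm No}_{C(S)}(x(S))\,{\rm d}x(S). \tag{5.9}$$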


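(For $S = I^{(d)}$, we write simply $\Phi$; cf. (1.4).) Finally, set:

$$\alpha(S) = \alpha(C(S)) = \int_{\mathbb{R}^{\#(S)}} \psi(S; x(S))\, f^{\rm No}_{C(S)}(x(S))\,{\rm d}x(S), \qquad A(k) = \sum_{S \subseteq I^{(d)}:\ \#(S)=k} \alpha(S) \tag{5.10}$$

and

$$\Lambda(S) = \alpha(S)\log\det C(S), \qquad \Lambda(k) := \sum_{S \subseteq I^{(d)}:\ \#(S)=k} \Lambda(S). \tag{5.11}$$

Consider the following condition, which invokes broken dependence: $\forall\ i \in S \subseteq I^{(d)}$, with $S_i^- = \{j \in S:\ j < i\}$ and $S_i^+ = \{j \in S:\ j > i\}$,

$$\int_{\mathbb{R}^{\#(S)}} \psi(S; x(S)) \Big\{ f^{\rm No}_{C(S)}(x(S)) - f^{\rm No}_{C(S_i^-)}\big(x(S_i^-)\big) \times f^{\rm No}_{\{i\}|S_i^-}\big(x_i\,|\,x(S_i^-)\big)\, f^{\rm No}_{S_i^+|S_i^-}\big(x(S_i^+)\,|\,x(S_i^-)\big) \Big\}\,{\rm d}x(S) \geq 0. \tag{5.12}$$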


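Theorem 5.5 (Cf. [2], Theorem 4 or [5], Theorem 31) Assume condition (5.12). Then the quantity $m(k) = m(k, C, \phi)$ defined by

$$m(k) := \binom{d}{k}^{-1} \sum_{S \subseteq I^{(d)}:\ \#(S)=k} \left\{ \frac{\alpha(S)}{2k}\log\big[(2\pi)^k \det C(S)\big] + \frac{\log e}{2k}\,{\rm tr}\big[C(S)^{-1}\Phi(S)\big] \right\}$$

is decreasing in $k = 1, \ldots, d$:

$$m(1) \geq \ldots \geq m(d). \tag{5.13}$$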


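Proof. For $X(S) \sim f^{\rm No}_{C(S)}$ we have, by using (1.3):

$$h^{\rm w}_{\psi(S)}(X(S)) = \frac{\alpha(S)}{2}\log\big[(2\pi)^k \det C(S)\big] + \frac{\log e}{2}\,{\rm tr}\big[C(S)^{-1}\Phi(S)\big].$$

Therefore,

$$m(k) = \binom{d}{k}^{-1} \sum_{S:\ |S|=k} \left\{ \frac{\alpha(S)}{2k}\log\big[(2\pi)^k \det C(S)\big] + \frac{\log e}{2k}\,{\rm tr}\big(C(S)^{-1}\Phi(S)\big) \right\}.$$

Invoking Theorem 2.2 completes the proof. ■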
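For a constant weight function $\phi \equiv 1$ we have $\alpha(S) = 1$ and $\Phi(S) = C(S)$, so that ${\rm tr}\,[C(S)^{-1}\Phi(S)] = k$ and the monotonicity of $m(k)$ reduces to the monotonicity of the averages $\frac{1}{k}\binom{d}{k}^{-1}\sum_{S:\,\#(S)=k}\log\det C(S)$, i.e., to the 'standard' Szasz-type bound of [2]. The following sketch checks this unweighted case numerically; the matrix `C` and the helper names are illustrative assumptions, not taken from the sources.

```python
from itertools import combinations
from math import comb, log

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in m]
    n, result = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return result

def sub(m, idx):
    """Principal sub-matrix C(S) for the index set S = idx (0-based)."""
    return [[m[i][j] for j in idx] for i in idx]

def szasz_average(c, k):
    """(1/k) times the average of log det C(S) over all S with #(S) = k."""
    d = len(c)
    total = sum(log(det(sub(c, s))) for s in combinations(range(d), k))
    return total / (k * comb(d, k))

# Illustrative strictly diagonally dominant (hence positive definite) matrix.
C = [[2.0, 0.5, 0.3, 0.1],
     [0.5, 1.5, 0.4, 0.2],
     [0.3, 0.4, 1.2, 0.3],
     [0.1, 0.2, 0.3, 0.8]]
averages = [szasz_average(C, k) for k in range(1, len(C) + 1)]
```

The list `averages` should be non-increasing in $k$; a non-constant weight function would additionally require numerical evaluation of the integrals in (5.7), (5.9) and (5.10), which is not attempted here.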


Theorem 5.6 (Cf. [2], Theorem 5 or [5], Theorem 32) Assuming (5.12), for all $r > 0$ the values


s(k ) = s(k, C, 4>) := 


-i 


A(fc) 1/fc exp 


SCl(d): #(5)=fe 




obey

$$s(1) \geq \ldots \geq s(d). \tag{5.14}$$

Proof. The assertion follows readily from Theorem 2.3. ■

Our next goal is to establish bounds for Toeplitz determinants extending Theorem 6 from [2] (or Theorem 27 from [5]). It is said that $C = (C_{ij})$ is a $d \times d$ Toeplitz matrix if $C_{ij} = C_{kl}$ whenever $|i-j| = |k-l|$. A more restrictive property is cyclic Toeplitz, where $C_{ij} = C_{kl}$ whenever ${\rm dist}_d(i,j) = {\rm dist}_d(k,l)$. Here, for $1 \leq i < j \leq d$ the cyclic distance ${\rm dist}_d(i,j) = \min\,[j-i,\ d-j+i]$; it is then extended to a metric with ${\rm dist}_d(i,j) = {\rm dist}_d(j,i)$ and ${\rm dist}_d(i,i) = 0$. As before, we consider sub-matrices $C(S)$ where $S \subseteq I^{(d)} := \{1, \ldots, d\}$ and the Gaussian random vectors $X(S) \sim f^{\rm No}_{C(S)}$ as sub-collections in $X_1^d := X \sim f^{\rm No}_C$. A special role is played by $S = I_{i,j}$, where $I_{i,j}$ stands for a segment of positive integers $\{i, i+1, \ldots, j\}$ of cardinality $j - i + 1$ where $1 \leq i < j \leq d$. In particular, for $S = I_{1,k}$, we set: $C(S) = C_k$ and deal with vectors $X_1^k \sim f^{\rm No}_{C_k}$, $1 \leq k \leq d$, with $C_d = C$.

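Accordingly, we say that WF $x \in \mathbb{R}^d \mapsto \phi(x) \geq 0$ has a Toeplitz property if the value of the reduced WF $\psi(I_{i,j}; x_i^j)$ coincides with $\psi(I_{i+k,j+k}; x_{i+k}^{j+k})$, provided that arguments $x_i^j = x(I_{i,j})$ and $x_{i+k}^{j+k} = x(I_{i+k,j+k})$ are shifts of each other, where $1 \leq i < j \leq d$ and $1 \leq i+k < j+k \leq d$. An example is where $C$ is cyclic Toeplitz and $\phi$ has a product-form: $\phi(x) = \prod_{1 \leq i \leq d} \varphi(x_i)$. Recall, the reduced WF in question involves the conditional PDF $f^{\rm No}_{I^{\rm c}_{i,j}|I_{i,j}}\big(x(I^{\rm c}_{i,j})\,|\,x_i^j\big)$:

$$\psi(I_{i,j}; x_i^j) = \int_{\mathbb{R}^{d-j+i-1}} \phi(x)\, f^{\rm No}_{I^{\rm c}_{i,j}|I_{i,j}}\big(x(I^{\rm c}_{i,j})\,|\,x_i^j\big)\,{\rm d}x(I^{\rm c}_{i,j}) \quad \text{where } I^{\rm c}_{i,j} = I^{(d)} \setminus I_{i,j}.$$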


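For $S = I_{1,k}$, $1 \leq k \leq d$, the Gaussian WE takes the form

$$h^{\rm w}_{\psi(k)}(X_1^k) = h^{\rm w}_{\psi(I_{1,k})}(X_1^k) = \frac{\alpha(C_k)}{2}\log\big[(2\pi)^k \det C_k\big] + \frac{\log e}{2}\,{\rm tr}\big[C_k^{-1}\Phi_k\big]. \tag{5.15}$$

Here the value $\alpha(C_k) = \alpha(C_k, C, \phi)$ and the $k \times k$ matrix $\Phi_k = \Phi_k(C_k, C, \phi)$ are given by

$$\alpha(C_k) = \int \psi(k; x_1^k)\, f^{\rm No}_{C_k}(x_1^k)\,{\rm d}x_1^k, \qquad \Phi_k = \int x_1^k \big(x_1^k\big)^{\rm T}\, \psi(k; x_1^k)\, f^{\rm No}_{C_k}(x_1^k)\,{\rm d}x_1^k \tag{5.16}$$

and $\psi(k) = \psi(I_{1,k})$. (For $k = d$, the subscript $k$ will be omitted.)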

Theorem 5.7 (Cf. [2], Theorem 6 or [5], Theorem 27) Suppose $C$ is a positive definite $d \times d$ Toeplitz matrix and $\phi$ has the Toeplitz property. Consider the map $k \in \{1, \ldots, d\} \mapsto a(k) = a(k, C, \phi)$ where

$$a(k) = \frac{\alpha(C_k)}{2}\Big[\log(2\pi) + \log\,(\det C_k)^{1/k}\Big] + \frac{\log e}{2k}\,{\rm tr}\big[C_k^{-1}\Phi_k\big]. \tag{5.17}$$

Assuming condition (5.12), the value $a(k)$ is decreasing in $k$: $a(1) \geq \ldots \geq a(d)$.

Proof. By using the Toeplitz property of $C$ and $\phi$, we can write

$$h^{\rm w}_{\psi(I_{1,k})}(X_k\,|\,X_1^{k-1}) = h^{\rm w}_{\psi(I_{2,k+1})}(X_{k+1}\,|\,X_2^k). \tag{5.18}$$

Next, Theorem 4.3 yields:

$$h^{\rm w}_{\psi(I_{2,k+1})}(X_{k+1}\,|\,X_2^k) \geq h^{\rm w}_{\psi(I_{1,k+1})}(X_{k+1}\,|\,X_1^k). \tag{5.19}$$

From (5.18) and (5.19) we conclude that $h^{\rm w}_{\psi(I_{1,k})}(X_k\,|\,X_1^{k-1})$ is decreasing in $k$. Thus the running average also decreases. On the other hand, by the chain rule

$$\frac{1}{k}\, h^{\rm w}_{\psi(I_{1,k})}(X_1^k) = \frac{1}{k} \sum_{i=1}^k h^{\rm w}_{\psi(I_{1,i})}(X_i\,|\,X_1^{i-1}).$$

Consequently, $\frac{1}{k}\, h^{\rm w}_{\psi(I_{1,k})}(X_1^k)$ too decreases in $k$. Referring to Eqns (5.16) and (5.15) leads directly to the result. ■
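In the unweighted case $\phi \equiv 1$ (where $\alpha(C_k) = 1$ and ${\rm tr}\,[C_k^{-1}\Phi_k] = k$), the statement of Theorem 5.7 reduces to the fact that $(\det C_k)^{1/k}$ decreases in $k$ for a positive definite Toeplitz matrix. The sketch below illustrates this for the Toeplitz matrix with entries $\rho^{|i-j|}$, for which $\det C_k = (1-\rho^2)^{k-1}$; this particular matrix, the value of `rho` and the helper names are our own illustrative assumptions.

```python
def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in m]
    n, result = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return result

def toeplitz_power(d, rho):
    """Toeplitz matrix with entries rho**|i-j| (an illustrative choice)."""
    return [[rho ** abs(i - j) for j in range(d)] for i in range(d)]

def cyclic_dist(i, j, d):
    """Cyclic distance min(|j-i|, d-|j-i|) on the index set {1,...,d}."""
    gap = abs(j - i)
    return min(gap, d - gap)

def leading_block(c, k):
    """Sub-matrix C_k = C(I_{1,k}) formed by the first k rows and columns."""
    return [row[:k] for row in c[:k]]

d, rho = 6, 0.5
C = toeplitz_power(d, rho)
# Values (det C_k)^{1/k}, k = 1,...,d, expected to be non-increasing.
roots = [det(leading_block(C, k)) ** (1.0 / k) for k in range(1, d + 1)]
```

The helper `cyclic_dist` is included only to make the definition of the cyclic distance concrete; it is not needed for the monotonicity check itself.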


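Theorem 5.8 (Cf. [5], Theorem 33.) Given a WF $x = (x_1, \ldots, x_d)^{\rm T} \mapsto \phi(x)$, assume condition

$$\int_{\mathbb{R}^d} \phi(x)\Big[ f^{\rm No}_C(x) - \prod_{i=1}^d f^{\rm No}_{C_{ii}}(x_i) \Big]\,{\rm d}x \geq 0. \tag{5.20}$$

Then the quantity

$$w(k) = w(k, C, \phi) = \binom{d}{k}^{-1}\frac{\alpha(C)}{2k}\log \prod_{S \subseteq I^{(d)}:\ \#(S)=k} \frac{(2\pi)^d(\det C)}{(2\pi)^{d-k}\big(\det C(S^{\rm c})\big)} + \binom{d}{k}^{-1}\frac{\log e}{2k} \sum_{S \subseteq I^{(d)}:\ \#(S)=k} \Big\{ {\rm tr}\big[C^{-1}\Phi\big] - {\rm tr}\big[C(S^{\rm c})^{-1}\Phi(S^{\rm c})\big] \Big\}$$

is increasing in $k$, with

$$w(1) \leq \cdots \leq w(d). \tag{5.21}$$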


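Proof. Using the conditional WE, we can write

$$h^{\rm w}_\phi\big(X(S)\,|\,X(S^{\rm c})\big) = h^{\rm w}_\phi\big(X(S), X(S^{\rm c})\big) - h^{\rm w}_{\psi(S^{\rm c})}\big(X(S^{\rm c})\big)$$

$$= \frac{\alpha(C)}{2}\log\big[(2\pi)^d(\det C)\big] + \frac{\log e}{2}\,{\rm tr}\big[C^{-1}\Phi\big] - \frac{\alpha(C)}{2}\log\big[(2\pi)^{d-k}\big(\det C(S^{\rm c})\big)\big] - \frac{\log e}{2}\,{\rm tr}\big[C(S^{\rm c})^{-1}\Phi(S^{\rm c})\big].$$

Here $\alpha(C) = \int_{\mathbb{R}^d} \phi(x) f^{\rm No}_C(x)\,{\rm d}x = \int_{\mathbb{R}^{\#(S^{\rm c})}} \psi\big(S^{\rm c}; x(S^{\rm c})\big) f^{\rm No}_{C(S^{\rm c})}\big(x(S^{\rm c})\big)\,{\rm d}x(S^{\rm c})$. Therefore,

$$h^{\rm w}_\phi\big(X(S)\,|\,X(S^{\rm c})\big) = \frac{\alpha(C)}{2}\log\left[\frac{(2\pi)^d(\det C)}{(2\pi)^{d-k}\big(\det C(S^{\rm c})\big)}\right] + \frac{\log e}{2}\Big\{ {\rm tr}\big[C^{-1}\Phi\big] - {\rm tr}\big[C(S^{\rm c})^{-1}\Phi(S^{\rm c})\big] \Big\}. \tag{5.22}$$

After that we apply Theorem 2.4 which completes the proof. ■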


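Remark 5.9 Note that the outermost inequality, $w(1) \leq w(d)$, can be rewritten as

$$\alpha(C)\log\big[(2\pi)^d(\det C)\big] + (\log e)\,{\rm tr}\big[C^{-1}\Phi\big] \geq \alpha(C)\log \prod_{i=1}^d \frac{2\pi(\det C)}{\det C\big(I_1^{i-1} \cup I_{i+1}^d\big)} + (\log e)\sum_{i=1}^d \Big\{ {\rm tr}\big[C^{-1}\Phi\big] - {\rm tr}\big[C\big(I_1^{i-1} \cup I_{i+1}^d\big)^{-1}\Phi\big(I_1^{i-1} \cup I_{i+1}^d\big)\big] \Big\}. \tag{5.23}$$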


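Our next goal is to establish additional WDIs by using Theorem 2.6. For this purpose, we first analyse the mutual Gaussian WE, $i^{\rm w}_\phi\big(X(S) : X(S^{\rm c})\big)$. According to the definition of the mutual WE in [9], we can write

$$i^{\rm w}_\phi\big(X(S) : X(S^{\rm c})\big) = h^{\rm w}_{\psi(S)}\big(X(S)\big) - h^{\rm w}_\phi\big(X(S)\,|\,X(S^{\rm c})\big).$$

Then, in accordance with (5.22), we have

$$i^{\rm w}_\phi\big(X(S) : X(S^{\rm c})\big) = \frac{\alpha(C)}{2}\log\left[\frac{(\det C(S))\,(\det C(S^{\rm c}))}{(\det C)}\right] + \frac{\log e}{2}\Big\{ {\rm tr}\big[C(S)^{-1}\Phi(S)\big] + {\rm tr}\big[C(S^{\rm c})^{-1}\Phi(S^{\rm c})\big] - {\rm tr}\big[C^{-1}\Phi\big] \Big\}.$$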


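In Theorems 5.10 and 5.11 we consider the following condition (5.24) stemming from (2.20): $\forall\ S \subseteq \{1, \ldots, d\}$ with $\#S \geq 2$ and $i, j \in S$ with $i \neq j$,

$$\int \phi(x)\, f^{\rm No}_{S^{\rm c}|S}\big(x(S^{\rm c})\,|\,x(S)\big) \Big[ f^{\rm No}_{C(S)}(x(S)) - f^{\rm No}_{C(S \setminus \{i,j\})}\big(x(S \setminus \{i,j\})\big)\, f^{\rm No}_{i|S \setminus \{i,j\}}\big(x_i\,|\,x(S \setminus \{i,j\})\big)\, f^{\rm No}_{j|S \setminus \{i,j\}}\big(x_j\,|\,x(S \setminus \{i,j\})\big) \Big]\,{\rm d}x \geq 0. \tag{5.24}$$

The proof of Theorems 5.10 and 5.11 is done with the help of Theorem 2.6, assuming that $X_1, X_2, \ldots, X_d$ are normally distributed with covariance matrix $C$.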

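Theorem 5.10 (Cf. [5], Theorem 34.) Assume condition (5.24). Let

$$u(k) = \binom{d}{k}^{-1}\frac{\alpha(C)}{2k}\log \prod_{S \subseteq I^{(d)}:\ \#(S)=k} \frac{(\det C(S))\,(\det C(S^{\rm c}))}{(\det C)} + \binom{d}{k}^{-1}\frac{\log e}{2k} \sum_{S \subseteq I^{(d)}:\ \#(S)=k} \Big\{ {\rm tr}\big[C(S)^{-1}\Phi(S)\big] + {\rm tr}\big[C(S^{\rm c})^{-1}\Phi(S^{\rm c})\big] - {\rm tr}\big[C^{-1}\Phi\big] \Big\}.$$

Then

$$u(1) \geq u(2) \geq \cdots \geq u(d-1) \geq u(d). \tag{5.25}$$

Theorem 5.11 (Cf. [5], Theorem 35.) Under condition (5.24), let

$$z(k) = \binom{d}{k}^{-1}\frac{\alpha(C)}{2}\log \prod_{S \subseteq I^{(d)}:\ \#(S)=k} \frac{(\det C(S))\,(\det C(S^{\rm c}))}{(\det C)} + \binom{d}{k}^{-1}\frac{\log e}{2} \sum_{S \subseteq I^{(d)}:\ \#(S)=k} \Big\{ {\rm tr}\big[C(S)^{-1}\Phi(S)\big] + {\rm tr}\big[C(S^{\rm c})^{-1}\Phi(S^{\rm c})\big] - {\rm tr}\big[C^{-1}\Phi\big] \Big\}.$$

Then

$$z(1) \geq z(2) \geq \cdots \geq z(\lfloor d/2 \rfloor). \tag{5.26}$$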


6 Weighted Hadamard-type inequalities 


In this section we group several results related to the weighted Hadamard inequality (WHI); cf. [9], Theorem 3.3. The WHI asserts that for a $d \times d$ positive definite matrix $C$, under condition (5.20) we have:

$$\alpha(C)\log \prod_i C_{ii} + (\log e)\sum_i C_{ii}^{-1}\Phi_{ii} - \alpha(C)\log\det C - (\log e)\,{\rm tr}\,C^{-1}\Phi \geq 0, \tag{6.1}$$

with equality iff $C$ is diagonal. Recall, $\alpha(C) = \alpha_\phi(C)$ and $\Phi = \Phi_C = \Phi_{C,\phi}$ are as in (1.4).
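For $\phi \equiv 1$ one has $\alpha(C) = 1$ and $\Phi = C$, so both trace-type terms in (6.1) equal $d\log e$ and cancel; (6.1) then becomes the classical Hadamard inequality $\det C \leq \prod_i C_{ii}$. A minimal numerical sketch of this unweighted case follows; the test matrices and the name `hadamard_gap` are illustrative assumptions.

```python
from math import log

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in m]
    n, result = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return result

def hadamard_gap(c):
    """LHS of (6.1) for phi = 1: sum_i log C_ii - log det C."""
    return sum(log(c[i][i]) for i in range(len(c))) - log(det(c))

# Illustrative positive definite matrix and its diagonal part.
C = [[2.0, 0.5, 0.3],
     [0.5, 1.5, 0.4],
     [0.3, 0.4, 1.2]]
D = [[C[i][j] if i == j else 0.0 for j in range(3)] for i in range(3)]
gap_full, gap_diag = hadamard_gap(C), hadamard_gap(D)
```

Here `gap_full` is expected to be strictly positive, whereas `gap_diag` vanishes up to rounding, in line with the equality case.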

We begin with the weighted version of the strong Hadamard inequality (WSHI). The inequality (and other bounds in this section) will involve determinants $\det C(S)$ of sub-matrices $C(S)$ in $C$ where, as before, $S$ is a subset of $I^{(d)} := \{1, \ldots, d\}$ of a special type. Namely, we fix $p \in \{1, \ldots, d-1\}$ and consider the segment $I_{p+1,d} = \{p+1, \ldots, d\}$, segment $I_{1,p} = \{1, \ldots, p\}$ and unions $\{i\} \cup I_{p+1,d}$ and $I_{1,i} \cup I_{p+1,d} = I^{\rm c}_{i+1,p}$ where $i \in I_{1,p}$. We deal with the related entry $C_{ii}$ in $C$ and sub-matrices

$$C^d_{p+1} := C(I_{p+1,d}), \quad C_1^{i-1} := C(I_{1,i-1}), \quad C(\{i\} \cup I_{p+1,d}) \quad \text{and} \quad C(I_{1,i} \cup I_{p+1,d})$$

and Gaussian random variables $X_i$ and vectors $X^d_{p+1} := X(I_{p+1,d})$, $X_1^{i-1} := X(I_{1,i-1})$, $X_i \vee X^d_{p+1} := X(\{i\} \cup I_{p+1,d})$ and $X_1^i \vee X^d_{p+1} := X(I_{1,i} \cup I_{p+1,d})$, using symbols $x_i$, $x^d_{p+1}$, $x_1^{i-1}$ and $x_1^i \vee x^d_{p+1}$ for their respective values. Thus, PDFs

$$f_{X^d_{p+1}}(x^d_{p+1}) = f^{\rm No}_{C^d_{p+1}}(x^d_{p+1}) \quad \text{and} \quad f_{X_1^i \vee X^d_{p+1}}(x_1^i \vee x^d_{p+1}) = f^{\rm No}_{C(I_{1,i} \cup I_{p+1,d})}(x_1^i \vee x^d_{p+1})$$

emerge, as well as conditional PDFs $f_{X_i|X^d_{p+1}}(x_i\,|\,x^d_{p+1})$ and $f_{X_1^{i-1}|X^d_{p+1}}(x_1^{i-1}\,|\,x^d_{p+1})$. Viz., $X_1^i \vee X^d_{p+1}$ and $x_1^i \vee x^d_{p+1}$ stand for the concatenated vectors

$$(X_1, \ldots, X_i, X_{p+1}, \ldots, X_d)^{\rm T} \quad \text{and} \quad (x_1, \ldots, x_i, x_{p+1}, \ldots, x_d)^{\rm T},$$

each with $i + d - p$ entries. As above (see (1.4)), for a given WF $x \in \mathbb{R}^d \mapsto \phi(x)$ we consider numbers $\alpha(C_1^p) = \alpha_\phi(C_1^p)$ and matrices $\Phi_{C_1^p} = \Phi_{C_1^p, C, \phi}$:

$$\alpha(C_1^p) = \alpha_\phi(C_1^p) = \int_{\mathbb{R}^d} \phi(x_1^d)\, f^{\rm No}_C(x_1^d)\,{\rm d}x_1^d, \qquad \Phi_{C_1^p} = \Phi_{C_1^p, C, \phi} = \int_{\mathbb{R}^d} x_1^p \big(x_1^p\big)^{\rm T} \phi(x)\, f^{\rm No}_C(x)\,{\rm d}x.$$

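(In Eqns (6.16) and (6.22)-(6.24) we will use variations of these formulas.) We also set

$$\Phi^d_{p+1} = \int_{\mathbb{R}^{d-p}} x^d_{p+1}\big(x^d_{p+1}\big)^{\rm T}\, \psi(I_{p+1,d}; x^d_{p+1})\, f_{X^d_{p+1}}(x^d_{p+1})\,{\rm d}x^d_{p+1},$$

$$\Phi(\{i\} \cup I_{p+1,d}) = \int_{\mathbb{R}^{d-p+1}} \big(x_i \vee x^d_{p+1}\big)\big(x_i \vee x^d_{p+1}\big)^{\rm T} \times \psi\big(\{i\} \cup I_{p+1,d};\, x_i \vee x^d_{p+1}\big)\, f_{X_i \vee X^d_{p+1}}\big(x_i \vee x^d_{p+1}\big)\,{\rm d}\big(x_i \vee x^d_{p+1}\big), \tag{6.2}$$

with reduced WFs $\psi(I_{p+1,d})$ and $\psi(\{i\} \cup I_{p+1,d})$ calculated as in (2.5), for $S = I_{p+1,d}$ and $S = \{i\} \cup I_{p+1,d}$.

Furthermore, we will assume in Theorem 6.1 that, $\forall\ i = 1, \ldots, p$, the reduced WF $\psi(S)$ with $S = \{1, \ldots, i, p+1, \ldots, d\} = I^{\rm c}_{i+1,p}$ obeys

$$\int_{\mathbb{R}^{i+d-p}} \psi\big(I^{\rm c}_{i+1,p};\, x_1^i \vee x^d_{p+1}\big) \Big\{ f_{X_1^i \vee X^d_{p+1}}\big(x_1^i \vee x^d_{p+1}\big) - f_{X^d_{p+1}}\big(x^d_{p+1}\big) \times \Big[ f_{X_i|X^d_{p+1}}\big(x_i\,|\,x^d_{p+1}\big)\, f_{X_1^{i-1}|X^d_{p+1}}\big(x_1^{i-1}\,|\,x^d_{p+1}\big) \Big] \Big\}\,{\rm d}\big(x_1^i \vee x^d_{p+1}\big) \geq 0. \tag{6.3}$$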


P+1 

The 'standard' SHI is

$$\det C \leq \det C^d_{p+1} \prod_{1 \leq i \leq p} \frac{\det C(\{i\} \cup I_{p+1,d})}{\det C^d_{p+1}}$$

or

$$\log\det C + (p-1)\log\det C^d_{p+1} \leq \sum_{1 \leq i \leq p} \log\det C(\{i\} \cup I_{p+1,d}). \tag{6.4}$$
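The logarithmic form (6.4) lends itself to a direct numerical check. The sketch below evaluates the difference between the right-hand and left-hand sides of (6.4) for every $p \in \{1, \ldots, d-1\}$; the positive definite matrix `C` and the helper names are illustrative assumptions.

```python
from math import log

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in m]
    n, result = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return result

def sub(m, idx):
    """Principal sub-matrix C(S) for the index set S = idx (0-based)."""
    return [[m[i][j] for j in idx] for i in idx]

def shi_gap(c, p):
    """RHS minus LHS of (6.4); p is 1-based with 1 <= p <= d - 1."""
    d = len(c)
    tail = list(range(p, d))                 # 0-based indices of I_{p+1,d}
    lhs = log(det(c)) + (p - 1) * log(det(sub(c, tail)))
    rhs = sum(log(det(sub(c, [i] + tail))) for i in range(p))
    return rhs - lhs

# Illustrative strictly diagonally dominant (hence positive definite) matrix.
C = [[2.0, 0.5, 0.3, 0.1],
     [0.5, 1.5, 0.4, 0.2],
     [0.3, 0.4, 1.2, 0.3],
     [0.1, 0.2, 0.3, 0.8]]
gaps = [shi_gap(C, p) for p in range(1, len(C))]
```

For $p = 1$ the two sides of (6.4) coincide, so the first entry of `gaps` is zero; the remaining entries are expected to be non-negative.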


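The WE approach offers the following WSHI:

Theorem 6.1 (Cf. [2], Theorem 8 or [5], Theorem 28.) Under condition (6.3), for $1 \leq p < d$,

$$\alpha(C)\log\big[(2\pi)^d \det C\big] + (\log e)\,{\rm tr}\big(C^{-1}\Phi\big) + (p-1)\Big\{ \alpha(C^d_{p+1})\log\big[(2\pi)^{d-p}\det C^d_{p+1}\big] + (\log e)\,{\rm tr}\big[(C^d_{p+1})^{-1}\Phi^d_{p+1}\big] \Big\}$$

$$\leq \sum_{1 \leq i \leq p} \Big\{ \alpha\big(C(\{i\} \cup I_{p+1,d})\big)\log\big[(2\pi)^{d-p+1}\det C(\{i\} \cup I_{p+1,d})\big] + (\log e)\,{\rm tr}\big[C(\{i\} \cup I_{p+1,d})^{-1}\Phi(\{i\} \cup I_{p+1,d})\big] \Big\}. \tag{6.5}$$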


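Proof. We use the same idea as in Theorem 3.3 from [9]. Recalling (6.9) we can write

$$h^{\rm w}_\phi\big(X_1^p\,|\,X^d_{p+1}\big) = \frac{1}{2}\log\big[(2\pi)^d\det C\big]\,\alpha(C) + \frac{\log e}{2}\,{\rm tr}\big(C^{-1}\Phi\big) - \frac{1}{2}\log\big[(2\pi)^{d-p}\det C^d_{p+1}\big]\,\alpha(C^d_{p+1}) - \frac{\log e}{2}\,{\rm tr}\big[(C^d_{p+1})^{-1}\Phi^d_{p+1}\big].$$

Cf. Eqns (5.9), (5.10), (5.16). Furthermore, by subadditivity of the conditional WE (see [9], Theorem 1.4), under assumption (6.3) we can write

$$h^{\rm w}_\phi\big(X_1^p\,|\,X^d_{p+1}\big) \leq \sum_{i=1}^p h^{\rm w}_{\psi(\{i\} \cup I_{p+1,d})}\big(X_i\,|\,X^d_{p+1}\big). \tag{6.6}$$

Here for $i = 1, \ldots, p$, again in agreement with (6.9),

$$h^{\rm w}_{\psi(\{i\} \cup I_{p+1,d})}\big(X_i\,|\,X^d_{p+1}\big) = \frac{1}{2}\log\big[(2\pi)^{d-p+1}\det C(\{i\} \cup I_{p+1,d})\big]\,\alpha\big(C(\{i\} \cup I_{p+1,d})\big) + \frac{\log e}{2}\,{\rm tr}\big[C(\{i\} \cup I_{p+1,d})^{-1}\Phi(\{i\} \cup I_{p+1,d})\big]$$

$$- \frac{1}{2}\log\big[(2\pi)^{d-p}\det C^d_{p+1}\big]\,\alpha(C^d_{p+1}) - \frac{\log e}{2}\,{\rm tr}\big[(C^d_{p+1})^{-1}\Phi^d_{p+1}\big].$$

Substituting into (6.6) yields the assertion of the theorem. ■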

Our next result, Theorem 6.2, gives an extension of Lemma 9 from [2] (or Lemma 8 from [5]). The latter asserts that an individual diagonal entry $C_{ii}$ of a $d \times d$ positive definite matrix equals the ratio of the relevant determinants, viz.,

$$C_{dd} = \frac{\det C}{\det C_1^{d-1}}, \quad \text{or} \quad \log C_{dd} + \log\det C_1^{d-1} - \log\det C = 0.$$

Remarkably, Theorem 6.2 does not require assumption (6.3).
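A short numerical illustration of the determinant ratio may be helpful. The sketch below rests on an assumption of ours, suggested by the use of conditional normality in the proof of Theorem 6.2: the quantity compared with the ratio $\det C / \det C_1^{d-1}$ is taken to be the conditional variance of $X_d$ given $X_1^{d-1}$, i.e., the Schur complement $C_{dd} - c^{\rm T}\big(C_1^{d-1}\big)^{-1}c$, where $c$ is the last column of $C$ without its final entry. The matrix `C` and the helper names are illustrative.

```python
from fractions import Fraction

def det(m):
    """Exact determinant by fraction-based Gaussian elimination."""
    a = [[Fraction(v) for v in row] for row in m]
    n, result = len(a), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return result

def conditional_variance(c):
    """Schur complement of the leading (d-1) x (d-1) block of c."""
    d = len(c) - 1
    block = [row[:d] for row in c[:d]]
    cross = [c[i][d] for i in range(d)]
    base = det(block)
    quad = Fraction(0)
    for j in range(d):
        # Cramer's rule: j-th entry of block^{-1} cross.
        replaced = [row[:j] + [cross[i]] + row[j + 1:]
                    for i, row in enumerate(block)]
        quad += cross[j] * det(replaced) / base
    return Fraction(c[d][d]) - quad

C = [[4, 1, 2], [1, 3, 1], [2, 1, 5]]        # illustrative positive definite matrix
ratio = det(C) / det([row[:2] for row in C[:2]])
schur = conditional_variance(C)
```

For this matrix both `ratio` and `schur` evaluate to 43/11 in exact arithmetic.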


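Theorem 6.2 (Cf. [2], Lemma 9 or [5], Lemma 8.) The following equality holds true:

$$\alpha(C_{dd})\log\big[(2\pi)C_{dd}\big] + \alpha(C_1^{d-1})\log\big[(2\pi)^{d-1}\det C_1^{d-1}\big] - \alpha(C)\log\big[(2\pi)^d\det C\big] = (\log e)\,{\rm tr}\big[C^{-1}\Phi\big] - (\log e)\,{\rm tr}\big[(C_1^{d-1})^{-1}\Phi_{C_1^{d-1}}\big] - (\log e)\,C_{dd}^{-1}\Phi_{dd}. \tag{6.7}$$

Proof. Using the conditional normality of $X_d$ given $X_1^{d-1}$, we can write

$$h^{\rm w}_\phi\big(X_d\,|\,X_1^{d-1}\big) = \frac{\alpha(C_{dd})}{2}\log\big[(2\pi)C_{dd}\big] + \frac{\log e}{2}\,C_{dd}^{-1}\Phi_{dd}. \tag{6.8}$$

On the other hand,

$$h^{\rm w}_\phi\big(X_d\,|\,X_1^{d-1}\big) = h^{\rm w}_\phi\big(X_1^d\big) - h^{\rm w}_{\psi(I_{1,d-1})}\big(X_1^{d-1}\big)$$

and therefore

$$\frac{\alpha(C_{dd})}{2}\log\big[(2\pi)C_{dd}\big] + \frac{\log e}{2}\,C_{dd}^{-1}\Phi_{dd} = \frac{\alpha(C)}{2}\log\big[(2\pi)^d\det C\big] + \frac{\log e}{2}\,{\rm tr}\big[C^{-1}\Phi\big] - \frac{\alpha(C_1^{d-1})}{2}\log\big[(2\pi)^{d-1}\det C_1^{d-1}\big] - \frac{\log e}{2}\,{\rm tr}\big[(C_1^{d-1})^{-1}\Phi_{C_1^{d-1}}\big]. \tag{6.9}$$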


The result then follows. ■

The next assertion, Theorem 6.3, extends the result of Theorem 9 from [2] (or Theorem 29 from [5]) that, $\forall\ p = 1, \ldots, d$, $C \mapsto \log\dfrac{\det C}{\det C_1^p}$ is a concave function of a positive definite $d \times d$ matrix $C$. We will write matrix $C$ in the block form similar to (3.3):

$$C = \begin{pmatrix} C_1^p & C^p_{d-p} \\ C^{d-p}_p & C^d_{p+1} \end{pmatrix}. \tag{6.10}$$

Set $Dx^d_{p+1} = C^p_{d-p}\big(C^d_{p+1}\big)^{-1}x^d_{p+1}$ and $B_1^p = C_1^p - C^p_{d-p}\big(C^d_{p+1}\big)^{-1}C^{d-p}_p$. Consider the following inequalities

$$\int \phi(x)\, f_{X_1^p}(x_1^p)\Big[ f_{X^d_{p+1}|X_1^p}\big(x^d_{p+1}\,|\,x_1^p\big) - f_{Y^d_{p+1}|Y_1^p}\big(x^d_{p+1}\,|\,x_1^p\big) \Big]\,{\rm d}x \geq 0 \tag{6.11}$$

and

$$\int \phi(x)\Big[ f_X(x) - f^{\rm No}_C(x) \Big]\Big\{ \log\big[(2\pi)^p\det(B_1^p)\big] + (\log e)\big(x_1^p - Dx^d_{p+1}\big)^{\rm T}\big(B_1^p\big)^{-1}\big(x_1^p - Dx^d_{p+1}\big) \Big\}\,{\rm d}x \leq 0. \tag{6.12}$$


Theorem 6.3 (Cf. [2], Theorem 9 or [5], Theorem 29.) Assume that $C = \lambda C' + (1-\lambda)C''$ where $C$, $C'$ and $C''$ are positive definite $d \times d$ matrices and $\lambda \in [0,1]$. Given a WF $x \mapsto \phi(x) \geq 0$ and $1 \leq p \leq d$, define:

$$\mu(C) = \alpha(C)\log\big[(2\pi)^d\det C\big] + (\log e)\,{\rm tr}\big[C^{-1}\Phi_C\big] - \alpha(C_1^p)\log\big[(2\pi)^p\det C_1^p\big] - (\log e)\,{\rm tr}\big[(C_1^p)^{-1}\Phi_{C_1^p}\big] \tag{6.13}$$

and similarly with $\mu(C')$ and $\mu(C'')$. Then

$$\mu(C) \geq \lambda\mu(C') + (1-\lambda)\mu(C''). \tag{6.14}$$


Proof. Again we essentially follow the method from [2] with modifications developed in [9]. Fix two $d \times d$ positive definite matrices $C'$ and $C''$ and set $X' \sim f^{\rm No}_{C'}$, $X'' \sim f^{\rm No}_{C''}$. Given $\lambda \in [0,1]$, consider a random variable $\Theta$ taking values $\vartheta = 1, 2$ with probabilities $\lambda$ and $1 - \lambda$ independently of $(X', X'')$. Next, set

$$X = \begin{cases} X', & \text{when } \Theta = 1, \\ X'', & \text{when } \Theta = 2. \end{cases}$$

Then $X \sim \big(\lambda f^{\rm No}_{C'} + (1-\lambda) f^{\rm No}_{C''}\big)$ and the covariance matrix ${\rm Cov}\,X = \lambda C' + (1-\lambda)C'' =: C$.

With the WF $\widetilde\phi(x_1^d, \vartheta) = \phi(x_1^d)$, use Theorem 2.1 from [9] and Theorem 3.2 from Section 3 and write:

(6.15)

Here $Y$ stands for the Gaussian random vector with the PDF $f^{\rm No}_C(x_1^d)$. The LHS in (6.15) coincides with $\lambda\mu(C') + (1-\lambda)\mu(C'')$ and the RHS with $\mu(C)$. This completes the proof. ■
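In the unweighted case $\phi \equiv 1$ the trace-type terms in the weighted functional are constants, and the concavity statement becomes the concavity of the log-ratio of $\det C$ to the determinant of a principal sub-block, as in Theorem 9 of [2]. The sketch below checks this numerically under the assumption (ours) that the reference block is the leading $p \times p$ block $C_1^p$; the matrices `C1`, `C2` and the helper names are illustrative.

```python
from math import log

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in m]
    n, result = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return result

def log_ratio(c, p):
    """log(det C / det C_1^p), with C_1^p the leading p x p block."""
    return log(det(c)) - log(det([row[:p] for row in c[:p]]))

def mix(c1, c2, lam):
    """Convex combination lam * C' + (1 - lam) * C''."""
    n = len(c1)
    return [[lam * c1[i][j] + (1 - lam) * c2[i][j] for j in range(n)]
            for i in range(n)]

# Two illustrative strictly diagonally dominant (positive definite) matrices.
C1 = [[2.0, 0.5, 0.3], [0.5, 1.5, 0.4], [0.3, 0.4, 1.2]]
C2 = [[1.0, -0.4, 0.2], [-0.4, 2.0, 0.6], [0.2, 0.6, 3.0]]

# Concavity gap f(mix) - [lam f(C') + (1 - lam) f(C'')] for several lam, p.
gaps = [log_ratio(mix(C1, C2, lam), p)
        - (lam * log_ratio(C1, p) + (1 - lam) * log_ratio(C2, p))
        for lam in (0.25, 0.5, 0.75) for p in (1, 2)]
```

All entries of `gaps` are expected to be non-negative; the check says nothing about non-constant weight functions, for which conditions (6.11) and (6.12) come into play.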


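In a particular case $p = d - 1$, the function $C \mapsto \dfrac{\det C}{\det C_1^{d-1}}$ is also concave (see [2], Theorem 10). The weighted version of this property is encapsulated in the following result. For a positive definite $d \times d$ matrix $C$ and a WF $x \mapsto \phi(x)$, set:

$$\varpi_\phi(C) := \frac{\alpha(C)}{2}\log\big[(2\pi)^d\det(C)\big] - \frac{\alpha(C_1^{d-1})}{2}\log\big[(2\pi)^{d-1}\det(C_1^{d-1})\big] + \frac{\log e}{2}\Big\{ {\rm tr}\big[C^{-1}\Phi_C\big] - {\rm tr}\big[(C_1^{d-1})^{-1}\Phi_{C_1^{d-1}}\big] \Big\}. \tag{6.16}$$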


Remark 6.4 When $\phi(x_1^d) \equiv 1$, the expression for $\varpi_\phi(C)$ in (6.16) simplifies to $\log\dfrac{2\pi\det C}{\det C_1^{d-1}}$.

The aforementioned concavity property from [2], Theorem 10 (or from [5], Theorem 30), is essentially equivalent to the following subadditivity-type property:

$$\log\frac{2\pi\det(A + B)}{\det\big(A_1^{d-1} + B_1^{d-1}\big)} \geq \log\frac{2\pi\det A}{\det A_1^{d-1}} + \log\frac{2\pi\det B}{\det B_1^{d-1}}.$$

The WE-version of this property is more involved: see Eqns (6.17) - (6.19). A crucial part is played by Lemma 4.3 with $X$ represented by the random variable $Z_d \sim f^{\rm No}_{A_{dd} + B_{dd}}$ and $Y$ is associated with the independent Gaussian pair of vectors $(X_1^{d-1}, Y_1^{d-1})$ having the joint PDF

$$f_{X_1^{d-1}, Y_1^{d-1}}\big(x_1^{d-1}, y_1^{d-1}\big) = f^{\rm No}_{A_1^{d-1}}\big(x_1^{d-1}\big)\, f^{\rm No}_{B_1^{d-1}}\big(y_1^{d-1}\big).$$

The random element $Z$ there is represented by $Z_1^{d-1}$, and the map $\xi$ takes $\big(x_1^{d-1}, y_1^{d-1}\big) \mapsto x_1^{d-1} + y_1^{d-1}$.


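Theorem 6.5 (Cf. [2], Theorem 10 or [5], Theorem 30.) Let $A$, $B$ be two positive definite $d \times d$ matrices and $X \sim f^{\rm No}_A$, $Y \sim f^{\rm No}_B$ be the corresponding independent Gaussian vectors, with $Z := X + Y \sim f^{\rm No}_{A+B}$. Consider a WF $(z_d, x_1^{d-1}, y_1^{d-1}) \in \mathbb{R} \times \mathbb{R}^{d-1} \times \mathbb{R}^{d-1} \mapsto \phi(z_d, x_1^{d-1}, y_1^{d-1})$ and assume the following inequality involving conditional normal PDFs $f_{Z_d|X_1^{d-1}, Y_1^{d-1}}$ and $f_{Z_d|Z_1^{d-1}}$:

$$\int \phi\big(z_d, x_1^{d-1}, y_1^{d-1}\big)\, f^{\rm No}_{A_1^{d-1}}\big(x_1^{d-1}\big)\, f^{\rm No}_{B_1^{d-1}}\big(y_1^{d-1}\big) \Big[ f_{Z_d|X_1^{d-1}, Y_1^{d-1}}\big(z_d\,|\,x_1^{d-1}, y_1^{d-1}\big) - f_{Z_d|Z_1^{d-1}}\big(z_d\,|\,x_1^{d-1} + y_1^{d-1}\big) \Big]\,{\rm d}z_d\,{\rm d}x_1^{d-1}\,{\rm d}y_1^{d-1} \geq 0. \tag{6.17}$$

Then

$$\varpi_\psi(A + B) \geq \varpi_\chi(A) + \varpi_\gamma(B). \tag{6.18}$$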
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Here 


tp( z i) = I 4>{z d — y d i z i 1 -y1 x ) 


i-h/rK-yfl/B No (y?),i 




-dy?, 


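$$\chi(x_1^d) = \int \psi\big(x_1^d + y_1^d\big)\, f^{\rm No}_B(y_1^d)\,{\rm d}y_1^d, \qquad \gamma(x_1^d) = \int \psi\big(x_1^d + y_1^d\big)\, f^{\rm No}_A(y_1^d)\,{\rm d}y_1^d. \tag{6.19}$$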


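Proof. As in [2], we use basic properties of Gaussian random variables. Assume $X \sim f^{\rm No}_A$ and $Y \sim f^{\rm No}_B$ are independent Gaussian random vectors and set $Z = X + Y \sim f^{\rm No}_{A+B}$. By virtue of (3.6) and Theorem 4.3 we can write:

$$h^{\rm w}_\psi\big(Z_d\,|\,Z_1^{d-1}\big) = h^{\rm w}_\psi(Z) - h^{\rm w}_\psi\big(Z_1^{d-1}\big) = \varpi_\psi(A + B) \geq h^{\rm w}_\phi\big(Z_d\,|\,X_1^{d-1}, Y_1^{d-1}\big). \tag{6.20}$$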


Next, owing to independence of $X$ and $Y$, the conditional WE $h^{\rm w}_\phi\big(X_d + Y_d\,|\,X_1^{d-1}, Y_1^{d-1}\big)$ equals the sum

[ Xi _1 (xi _1 )/xj- i ( x i _1 ){^ 1 °g ( —^=ry J [ Xd(x)f Xd \ X d-i{x\xf~ 1 )dx 

J -1 1 ^ 

+ l ^^ A dd 1] f x 2 Xd(x)f Xd 

R 

-^ly) f r rd(x)f YdlY d-i{x\xf- 1 )dx 

3 dd J I 


+ / 7r i (xr i )/ Yf -i(x*-^-io g 


( 6 . 21 ) 


+ 


log e g( _i) 


dd 


J x 2 7 d(x)/ yd | Y d-i(.T|xf x )dx 


(The fact that $X_d$ and $Y_d$ are scalar Gaussian variables is crucial here.) The first summand equals


1 


log 


2vr 


A 


(- 1 ) 

dd 


1 J x( x i)/xf( x i) dx i + ~TT A dd 1] J x( x iV'd/xf( x f) dx i 

' »d-l lRd-1 


( 6 . 22 ) 


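and coincides with

$$h^{\rm w}_\chi\big(X_d\,|\,X_1^{d-1}\big) = \frac{\alpha_\chi(A)}{2}\log\big[(2\pi)^d\det A\big] - \frac{\alpha_\chi(A_1^{d-1})}{2}\log\big[(2\pi)^{d-1}\det A_1^{d-1}\big] + \frac{\log e}{2}\,{\rm tr}\big[A^{-1}\Phi_{A,\chi}\big] - \frac{\log e}{2}\,{\rm tr}\big[(A_1^{d-1})^{-1}\Phi_{A_1^{d-1},\chi}\big] =: \varpi_\chi(A). \tag{6.23}$$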


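Similarly, the second summand coincides with

$$h^{\rm w}_\gamma\big(Y_d\,|\,Y_1^{d-1}\big) = \frac{\alpha_\gamma(B)}{2}\log\big[(2\pi)^d\det B\big] - \frac{\alpha_\gamma(B_1^{d-1})}{2}\log\big[(2\pi)^{d-1}\det B_1^{d-1}\big] + \frac{\log e}{2}\,{\rm tr}\big[B^{-1}\Phi_{B,\gamma}\big] - \frac{\log e}{2}\,{\rm tr}\big[(B_1^{d-1})^{-1}\Phi_{B_1^{d-1},\gamma}\big] =: \varpi_\gamma(B). \tag{6.24}$$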


We therefore obtain the property claimed in (6.18): $\varpi_\psi(A + B) \geq \varpi_\chi(A) + \varpi_\gamma(B)$. ■

Finally, combining (5.23) and (6.1), we offer

Theorem 6.6 (Cf. [5], Corollary 4) Given a $d \times d$ positive definite matrix $C$, assume condition (5.20). Then

$$\alpha(C)\log \prod_{i=1}^d \frac{2\pi(\det C)}{\det C\big(I_1^{i-1} \cup I_{i+1}^d\big)} + (\log e)\sum_{i=1}^d \Big\{ {\rm tr}\big[C^{-1}\Phi\big] - {\rm tr}\big[C\big(I_1^{i-1} \cup I_{i+1}^d\big)^{-1}\Phi\big(I_1^{i-1} \cup I_{i+1}^d\big)\big] \Big\}$$

$$\leq \alpha(C)\log\det C + (\log e)\,{\rm tr}\,C^{-1}\Phi \leq \alpha(C)\log \prod_i C_{ii} + (\log e)\sum_i C_{ii}^{-1}\Phi_{ii}. \tag{6.25}$$